Abstract

Information embedding (IE) is the transmission of information within a host signal subject to a 
distortion constraint. There are two types of embedding methods, namely irreversible IE and reversible 
IE, depending upon whether or not the host, as well as the message, is recovered at the decoder. In 
irreversible IE, only the embedded message is recovered at the decoder, and in reversible IE, both the 
message and the host are recovered at the decoder. This paper considers combinations of irreversible
and reversible IE in multiple access channels (MAC) and physically degraded broadcast channels (BC). 

This paper first considers MAC IE in which separate encoders embed their messages into their host 
signals subject to distortion constraints. The embedded signals from the two encoders are transmitted to 
a single decoder across a MAC. This paper studies the capacity region in three cases: A) no host recovery
at the decoder, B) lossless recovery of one host at the decoder, and C) lossless recovery of both hosts
at the decoder. For Cases A and B, inner bounds on the respective capacity regions are developed.
For Case C, inner and outer bounds on the capacity region are developed, and the capacity region is
obtained if the hosts are independent.

This paper also considers BC IE in which two messages intended for separate decoders are embedded 
into a given host sequence by a single encoder subject to a distortion constraint. This paper studies the
capacity region for the degraded BC in four cases: A') lossless recovery of the host sequence at neither
of the decoders, B') lossless recovery of the host sequence at only the better decoder, C') lossless
recovery of the host sequence at both decoders, and D') lossless recovery of the host sequence at only
the worse decoder. For Cases A' and B', inner and outer bounds on the respective capacity regions
are developed. For Cases C' and D', the respective capacity regions are obtained.

Index Terms 

Information Embedding, Reversible Information Embedding, Multiple Access Channels, Broadcast 
Channels 



I. Introduction

Information embedding (IE) is the reliable transmission of information within a host signal
subject to a distortion constraint. IE is a recent area of digital media research with many
applications, including active and passive copyright protection (digital watermarking); steganography;
embedding important control, descriptive, or reference information into a given signal; digital
upgrades of communication infrastructure; and covert communications [1], [2], [3], [4]. The main
idea of IE is that the host signal can carry different messages at the same time by allowing a
small amount of distortion that can be tolerated at the intended receiver for the host signal. It
has been observed that IE is closely related to state-dependent channel models with state known
non-causally at the encoder [1], [2], [5], [6], [7].

A. Forms of IE

In IE, a message $W$ is embedded into a host signal $S^n$ such that the embedded signal $X^n$ is
close to $S^n$ under some prescribed distortion measure $d(\cdot,\cdot)$, i.e., $E\,d(X^n, S^n) \le \Delta$. The decoder
receives $Y^n$, which is drawn according to a probability law $p(y^n|x^n, s^n)$ for given $X^n$ and $S^n$.
Throughout the paper, we focus on the discrete memoryless case without feedback and denote
the channel law by $p(y|x, s)$. Based upon whether or not the decoder recovers the host signal
in the sense of probability of error going to zero, there are two important types of IE, namely
irreversible and reversible IE.

In irreversible IE, the decoder is only concerned with reliable decoding of the message
embedded in the host from the received sequence $Y^n$ [1], [2], [7], [8]. The irreversible IE
capacity of a single-user model is given by

$$C(\Delta) = \max_{p(u,x|s):\; E\,d(X,S) \le \Delta} \left[ I(U;Y) - I(U;S) \right],$$

where $U$ is an auxiliary random variable with $|\mathcal{U}| \le |\mathcal{X}||\mathcal{S}|$. To achieve the capacity, Gel'fand-Pinsker
coding [5] is used at the encoder such that the distortion between $X^n$ and $S^n$ satisfies
the constraint $\Delta$.
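
As a rough numerical illustration of the capacity expression above, the following Python sketch evaluates the objective $I(U;Y) - I(U;S)$ for one fixed test distribution $p(u,x|s)$; it does not carry out the maximization. The toy setting (uniform binary host, noiseless channel $Y = X$, Hamming distortion, and the choice $U = X$) and all function names are assumptions made for illustration and are not taken from the paper.

```python
from itertools import product
from math import log2


def mutual_information(joint):
    """I(A;B) in bits for a joint pmf given as {(a, b): probability}."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0.0) + p
        pb[b] = pb.get(b, 0.0) + p
    return sum(p * log2(p / (pa[a] * pb[b]))
               for (a, b), p in joint.items() if p > 0)


def gp_objective(p_s, p_ux_given_s, p_y_given_xs):
    """Evaluate I(U;Y) - I(U;S) for the joint pmf p(s) p(u,x|s) p(y|x,s)."""
    joint_uy, joint_us = {}, {}
    for s, ps in p_s.items():
        for (u, x), pux in p_ux_given_s[s].items():
            joint_us[(u, s)] = joint_us.get((u, s), 0.0) + ps * pux
            for y, py in p_y_given_xs[(x, s)].items():
                joint_uy[(u, y)] = joint_uy.get((u, y), 0.0) + ps * pux * py
    return mutual_information(joint_uy) - mutual_information(joint_us)


# Assumed toy example: X flips the host bit with probability delta, so the
# expected Hamming distortion equals delta; U is chosen equal to X.
delta = 0.1
p_s = {0: 0.5, 1: 0.5}
p_ux_given_s = {s: {(s, s): 1 - delta, (1 - s, 1 - s): delta} for s in (0, 1)}
p_y_given_xs = {(x, s): {x: 1.0} for x, s in product((0, 1), repeat=2)}  # Y = X

rate = gp_objective(p_s, p_ux_given_s, p_y_given_xs)
print(f"I(U;Y) - I(U;S) = {rate:.4f} bits per host symbol")
```

For this particular choice the objective reduces to the binary entropy of `delta`, since $I(U;Y) = 1$ bit and $I(U;S) = 1 - h(\delta)$; other test distributions would have to be searched to approach the maximum in general.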

In reversible IE, the decoder is concerned with lossless recovery of the host as well as
reliable decoding of the embedded message from the received sequence $Y^n$ [9],
[10]. Reversible IE is useful for cases in which little or no degradation of the host signal is
allowed, with applications in military and medical imagery, and multimedia archives of valuable
original works. The reversible IE capacity is given by

$$C(\Delta) = \max_{p(x|s):\; E\,d(X,S) \le \Delta} \left[ I(X,S;Y) - H(S) \right].$$

To achieve the above capacity expression, superposition coding is used at the encoder such that
the distortion constraint is satisfied, i.e., $E[d(X, S)] \le \Delta$.
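
To make the reversible IE expression concrete, the sketch below evaluates $I(X,S;Y) - H(S)$ for a single fixed $p(x|s)$, again without maximizing. The example (a Bernoulli(0.1) host, a noiseless channel $Y = X$, and an embedder that flips each host bit with probability 0.2) is an assumed toy case, and the helper names are hypothetical.

```python
from math import log2


def entropy(pmf):
    """Entropy in bits of a pmf given as {symbol: probability}."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)


def reversible_objective(p_s, p_x_given_s, p_y_given_xs):
    """Evaluate I(X,S;Y) - H(S) = H(Y) - H(Y|X,S) - H(S)."""
    p_y = {}
    h_y_given_xs = 0.0
    for s, ps in p_s.items():
        for x, px in p_x_given_s[s].items():
            cond = p_y_given_xs[(x, s)]
            h_y_given_xs += ps * px * entropy(cond)
            for y, py in cond.items():
                p_y[y] = p_y.get(y, 0.0) + ps * px * py
    return entropy(p_y) - h_y_given_xs - entropy(p_s)


# Assumed toy example: sparse binary host, bit flips with probability 0.2
# (expected Hamming distortion 0.2), noiseless channel Y = X.
p_s = {0: 0.9, 1: 0.1}
p_x_given_s = {s: {s: 0.8, 1 - s: 0.2} for s in (0, 1)}
p_y_given_xs = {(x, s): {x: 1.0} for x in (0, 1) for s in (0, 1)}

rate = reversible_objective(p_s, p_x_given_s, p_y_given_xs)
print(f"I(X,S;Y) - H(S) = {rate:.4f} bits per host symbol")
```

The subtraction of $H(S)$ reflects that part of the channel's information flow is spent on describing the host itself; with a less compressible host the same embedder would leave a smaller, possibly negative, value.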

This paper focuses on IE in multi-user channels such as multiple access channels (MAC) and
broadcast channels (BC). We focus on MAC IE with lossless recovery of some host sequences
at the decoder and BC IE with lossless host recovery at some decoders, but the techniques
can also be applied to other multi-user scenarios. In single-user IE, substantial results have been
developed, but multi-user IE scenarios have not been as extensively studied. Single-user public
and private watermarking systems are studied from an information-theoretic viewpoint in [11], [12], [13]. Joint
IE and lossy compression is studied in [14], [15], and joint watermarking and encryption is studied
in [16]. Multi-user models with state available at the encoders are studied in [17], [18], [19],
[20], [21], [22], [23], [24], [25], [26], [27]. As in the single-user case, there is a close relationship
between multi-user models with non-causal state at the encoders and multi-user IE.

B. Summary of Results 

1) MAC IE: In Section II, we consider a two-user MAC IE model, shown in Figure 1, but
the results can be extended to any number of users. Encoder $i$ embeds its information $W_i$ into a
host signal $S_i^n$, generated by host source $i$, such that the per-letter distortion between $S_i^n$ and
$X_i^n$ is at most $\Delta_i$, $i = 1, 2$.

Fig. 1. Block diagram of multiple access channel information embedding model.

For this model, we consider the following three cases in recovering, in the sense of probability
of error going to zero, the messages and the host sequences at the decoder from the received
sequence $Y^n$:

• Case A, Recovery of Neither Host: The decoder recovers $(W_1, W_2)$ from $Y^n$.

• Case B, Recovery of One Host: The decoder recovers $(W_1, W_2)$ along with one host
from $Y^n$. Without loss of generality, we can assume that the host sequence $S_2^n$ of Encoder 2
is recovered at the decoder.

• Case C, Recovery of Both Hosts: The decoder recovers $(W_1, W_2)$ and $(S_1^n, S_2^n)$ from $Y^n$.

Our general MAC IE model considers scenarios in which the MAC output potentially depends
on both the embedded signals and the host signals. For Cases A and B, we develop inner bounds
on the respective capacity regions in Sections II-A and II-B, respectively. For Case C, we derive
inner and outer bounds on the capacity region if the hosts are correlated in Section II-C, and we
show that there is no gap between the inner and the outer bounds if the hosts are independent.

2) BC IE: In Section III, we consider IE in a broadcast scenario, as shown in Figure 2,
which illustrates only two decoders; in principle the model and results can be extended to any
number of decoders. In this model, the encoder embeds two independent messages $(W_1, W_2)$
into a single host sequence $S^n$ such that the distortion between the embedded signal $X^n$ and
$S^n$ satisfies a given distortion constraint $\Delta$. In this paper, we focus on the case of a degraded
broadcast channel, i.e., $p(y, z|x, s) = p(y|x, s)\,p(z|y)$. Decoder 1, or the better decoder, receives
the channel output $Y^n$, which is drawn according to a memoryless probability law $p(y|x, s)$ for
given $X^n$ and $S^n$. Decoder 2, or the worse decoder, receives the sequence $Z^n$, which is a corrupted
version of $Y^n$.
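
The degradedness condition $p(y,z|x,s) = p(y|x,s)\,p(z|y)$ means that the law seen by the worse decoder is obtained by passing $Y$ through a second channel. The Python sketch below computes the induced $p(z|x,s) = \sum_y p(y|x,s)\,p(z|y)$; the specific channels (a binary symmetric channel acting on $X \oplus S$, followed by a second binary symmetric channel) are assumptions chosen only to illustrate the composition.

```python
def degrade(p_y_given_xs, p_z_given_y):
    """Return p(z|x,s) = sum_y p(y|x,s) p(z|y) for a physically degraded BC."""
    p_z_given_xs = {}
    for xs, p_y in p_y_given_xs.items():
        p_z = {}
        for y, py in p_y.items():
            for z, pz in p_z_given_y[y].items():
                p_z[z] = p_z.get(z, 0.0) + py * pz
        p_z_given_xs[xs] = p_z
    return p_z_given_xs


def bsc(crossover):
    """Transition table of a binary symmetric channel."""
    return {0: {0: 1 - crossover, 1: crossover},
            1: {0: crossover, 1: 1 - crossover}}


# Assumed toy channels: Y = X xor S xor N1 with N1 ~ Bernoulli(0.1),
# and Z = Y xor N2 with N2 ~ Bernoulli(0.2).
first = bsc(0.1)
p_y_given_xs = {(x, s): first[x ^ s] for x in (0, 1) for s in (0, 1)}
p_z_given_xs = degrade(p_y_given_xs, bsc(0.2))

for xs, p_z in sorted(p_z_given_xs.items()):
    print(xs, p_z)
```

In this toy case the worse decoder sees an effective crossover probability of $0.1 \cdot 0.8 + 0.9 \cdot 0.2 = 0.26$, which is larger than that of the better decoder.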

Fig. 2. Block diagram of the broadcast information embedding model.

For this model, we consider the following four cases in recovering, in the sense of probability
of error going to zero, the messages and the host sequences at the decoders:

• Case A', No Host Recovery: Decoder 1 recovers $(W_1, W_2)$ from $Y^n$; Decoder 2 recovers
$W_2$ from $Z^n$.

• Case B', Host Recovery at the Better Decoder: Decoder 1 recovers $(W_1, W_2)$ and $S^n$
from $Y^n$; Decoder 2 recovers $W_2$ from $Z^n$.

• Case C', Host Recovery at Both Decoders: Decoder 1 recovers $(W_1, W_2)$ and $S^n$ from
$Y^n$; Decoder 2 recovers $W_2$ and $S^n$ from $Z^n$.

• Case D', Host Recovery at the Worse Decoder: Decoder 1 recovers $(W_1, W_2)$ from $Y^n$;
Decoder 2 recovers $W_2$ and $S^n$ from $Z^n$.

Inner and outer bounds for the BC IE capacity region in Case A' without an encoder distortion
constraint are derived in [21]; in this paper, we extend the results to incorporate an encoder
distortion constraint in Section III-A. For Case B', we develop inner and outer bounds for the
BC IE capacity region in Section III-B, and for Cases C' and D' we derive the BC IE capacity
region in Section III-C and Section III-D, respectively. It turns out that the capacity regions in
Cases C' and D' are identical because the channel output $Z^n$ is a degraded version of $Y^n$. The
capacity region for the model considered in Case C' if compressed hosts are available at the
decoders is obtained in [28].

C. Notation 

Throughout the paper, random variables and sample values are denoted in a special font, e.g.,
random variable $X$ and sample value $x$. Alphabets are denoted in calligraphic font, e.g., $\mathcal{X}$, and are
all discrete. The shorthand $X_i^n$ represents the sequence $X_{i,1}, X_{i,2}, \ldots, X_{i,n}$, and $X_{i,j}^n$ represents the
sequence $X_{i,j}, X_{i,j+1}, \ldots, X_{i,n}$. Finally, $H(\cdot)$ and $I(\cdot;\cdot)$ denote the standard information-theoretic
quantities of (ensemble average) entropy and mutual information, respectively.

II. MAC IE 

In this section, we formally describe the model shown in Figure 1. Host source $i$ generates
a sequence $S_i^n = S_{i1} S_{i2} \cdots S_{in}$ of symbols from the discrete alphabet $\mathcal{S}_i$, $i = 1, 2$. We assume
that the host sequence pair $(S_1^n, S_2^n)$ is generated by repeated independent drawings of a pair of
discrete random variables $(S_1, S_2)$ from a given joint distribution $p(s_1, s_2)$. The host sequence
$S_i^n$ is non-causally known at Encoder $i$ for $i = 1, 2$. The message source at Encoder $i$ produces
the message index $W_i \in \mathcal{W}_i = \{1, 2, \ldots, M_i\}$ with equal probability $1/M_i$, for $i = 1, 2$. The
message index at any encoder is independent of all host sequences and also independent of the
messages at all other encoders. The rate at Encoder $i$, in bits per channel use, is defined as
$R_i = (1/n) \log_2(M_i)$.

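Definition 1: A $(M_1, M_2, D_1^{(n)}, D_2^{(n)}, n)$ MAC IE code consists of sequences of encoding
functions at Encoder 1 and Encoder 2,

$$f_1^n : \mathcal{W}_1 \times \mathcal{S}_1^n \to \mathcal{X}_1^n \quad \text{and} \quad f_2^n : \mathcal{W}_2 \times \mathcal{S}_2^n \to \mathcal{X}_2^n,$$

respectively, and a sequence of decoding functions,

• Recovery of Neither Host: $g_A^n : \mathcal{Y}^n \to (\mathcal{W}_1, \mathcal{W}_2)$
• Recovery of One Host: $g_B^n : \mathcal{Y}^n \to (\mathcal{W}_1, \mathcal{W}_2, \mathcal{S}_2^n)$
• Recovery of Both Hosts: $g_C^n : \mathcal{Y}^n \to (\mathcal{W}_1, \mathcal{W}_2, \mathcal{S}_1^n, \mathcal{S}_2^n)$

The distortions associated with the MAC IE code are defined as $D_i^{(n)} = E\,d_i^n(S_i^n, X_i^n)$ for the additive
distortion function

$$d_i^n(S_i^n, X_i^n) = \frac{1}{n} \sum_{j=1}^{n} d_i(S_{ij}, X_{ij})$$

for some non-negative bounded distortion functions $d_i(S_{ij}, X_{ij})$, where $i = 1, 2$.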
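
The additive distortion function in Definition 1 can be sketched in a few lines of Python. The snippet evaluates the per-letter average for one realization of a host and embedded sequence; the distortion $D_i^{(n)}$ in the definition is the expectation of this quantity. The Hamming measure and the sample sequences are assumptions used only for illustration.

```python
def average_distortion(s_seq, x_seq, d):
    """Per-letter average (1/n) * sum_j d(s_j, x_j) for one sequence pair."""
    if len(s_seq) != len(x_seq) or not s_seq:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(d(s, x) for s, x in zip(s_seq, x_seq)) / len(s_seq)


def hamming(s, x):
    """Assumed example of a non-negative bounded per-letter distortion."""
    return 0 if s == x else 1


host = [0, 1, 1, 0]       # hypothetical host realization s^n
embedded = [0, 1, 0, 0]   # hypothetical embedded sequence x^n
dist = average_distortion(host, embedded, hamming)
print(dist)
```

Here one of the four positions differs, so the average Hamming distortion of this realization is 0.25.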

The embedded signals $X_1^n$ and $X_2^n$ from Encoder 1 and Encoder 2, respectively, are transmitted
across a MAC $p(y|x_1, s_1, x_2, s_2)$ without feedback, modeled as a memoryless conditional
probability distribution

$$\Pr(y^n | x_1^n, s_1^n, x_2^n, s_2^n) = \prod_{j=1}^{n} p(y_j | x_{1j}, s_{1j}, x_{2j}, s_{2j}). \quad (1)$$

Definition 2: A rate pair $(R_1, R_2)$ for a given distortion pair $(\Delta_1, \Delta_2)$ is said to be MAC
IE achievable if there exists a sequence of $(\lceil 2^{nR_1} \rceil, \lceil 2^{nR_2} \rceil, D_1^{(n)}, D_2^{(n)}, n)$ MAC IE codes with
$\lim_{n \to \infty} D_i^{(n)} \le \Delta_i$ for $i = 1, 2$, and $\lim_{n \to \infty} P_e^n = 0$, where $P_e^n$ is the probability of error
defined appropriately for each case in the sequel of this section.

Definition 3: For given $p(s_1, s_2)$ and $p(y|x_1, s_1, x_2, s_2)$, let $\mathcal{P}^{i}_{\mathrm{MAC}}(\Delta_1, \Delta_2)$ be the set of all
random variable tuples $(Q, S_1, S_2, (U_1, X_1), (U_2, X_2), Y)$ taking values in finite alphabets $\mathcal{Q}$, $\mathcal{S}_1$, $\mathcal{S}_2$,
$\mathcal{U}_1 \times \mathcal{X}_1$, $\mathcal{U}_2 \times \mathcal{X}_2$, and $\mathcal{Y}$, respectively, with joint distribution satisfying the conditions

a) $\sum_{q, (u_1, x_1), (u_2, x_2), y} p(q, s_1, s_2, (u_1, x_1), (u_2, x_2), y) = p(s_1, s_2)$,

b) $p(q, s_1, s_2, (u_1, x_1), (u_2, x_2), y) = p(q)\,p(s_1, s_2)\,p(u_1, x_1|s_1, q)\,p(u_2, x_2|s_2, q)\,p(y|x_1, s_1, x_2, s_2)$,

c) $E\,d_i(S_i, X_i) \le \Delta_i$, for $i = 1, 2$.

Definition 4: For given $p(s_1, s_2)$ and $p(y|x_1, s_1, x_2, s_2)$, let $\mathcal{P}^{o}_{\mathrm{MAC}}(\Delta_1, \Delta_2)$ be the set of all
random variable tuples $(Q, S_1, S_2, X_1, X_2, Y)$ taking values in finite alphabets $\mathcal{Q}$, $\mathcal{S}_1$, $\mathcal{S}_2$, $\mathcal{X}_1$, $\mathcal{X}_2$, and
$\mathcal{Y}$, respectively, with joint distribution satisfying the conditions

a) $\sum_{q, x_1, x_2, y} p(q, s_1, s_2, x_1, x_2, y) = p(s_1, s_2)$,

b) $p(q, s_1, s_2, x_1, x_2, y) = p(q)\,p(s_1, s_2)\,p(x_1, x_2|s_1, s_2, q)\,p(y|x_1, s_1, x_2, s_2)$,

c) $E\,d_i(S_i, X_i) \le \Delta_i$, for $i = 1, 2$.

A. Recovery of Neither Host 

In this section, we derive an inner bound on the MAC IE capacity region for Case A, in
which the decoder recovers only $(W_1, W_2)$ from $Y^n$. We define the MAC IE capacity region
$\mathcal{C}_{\mathrm{MAC},A}(\Delta_1, \Delta_2)$ as the closure of the set of all MAC IE achievable rates $(R_1, R_2)$ with
$P_e^n := \Pr[g_A^n(Y^n) \ne (W_1, W_2)] \to 0$ as $n \to \infty$. The following proposition provides an inner bound on
the capacity region.

Proposition 1: Let $\mathcal{R}^{i}_{\mathrm{MAC},A}(\Delta_1, \Delta_2)$ be the closure of the set of all rate pairs $(R_1, R_2)$ such
that

$$R_1 \le I(U_1; U_2, Y | Q) - I(U_1; S_1 | Q), \quad (2a)$$

$$R_2 \le I(U_2; U_1, Y | Q) - I(U_2; S_2 | Q), \quad (2b)$$

$$R_1 + R_2 \le I(U_1, U_2; Y | Q) - I(U_1, U_2; S_1, S_2 | Q) \quad (2c)$$

for some $(Q, S_1, S_2, (U_1, X_1), (U_2, X_2), Y) \in \mathcal{P}^{i}_{\mathrm{MAC}}(\Delta_1, \Delta_2)$, where $U_1$ and $U_2$ are auxiliary
random variables. Then, $\mathcal{R}^{i}_{\mathrm{MAC},A}(\Delta_1, \Delta_2) \subseteq \mathcal{C}_{\mathrm{MAC},A}(\Delta_1, \Delta_2)$.

Remarks

• The inner bound in Proposition 1 is similar to that in [29], which considers a Gaussian MAC
with no host recovery, but the result here is for the discrete memoryless case. Because the
coding procedures and error events in [29] apply, we do not provide a proof here.

• To achieve the inner bound, distortion-constrained Gel'fand-Pinsker codes can be used to
embed $W_1$ and $W_2$ into the host sequences $S_1^n$ and $S_2^n$ such that the distortion constraints
$\Delta_1$ and $\Delta_2$ are met, respectively.

B. Recovery of One Host 

In this section, we derive an inner bound on the MAC IE capacity region for Case B,
in which the decoder recovers $(W_1, W_2, S_2^n)$ from $Y^n$. We define the MAC IE capacity region
$\mathcal{C}_{\mathrm{MAC},B}(\Delta_1, \Delta_2)$ as the closure of the set of all MAC IE achievable rates $(R_1, R_2)$ with
$P_e^n := \Pr[g_B^n(Y^n) \ne (W_1, W_2, S_2^n)] \to 0$ as $n \to \infty$. The following proposition provides an inner bound
for the capacity region.

Proposition 2: Let $\mathcal{R}^{i}_{\mathrm{MAC},B}(\Delta_1, \Delta_2)$ be the closure of the set of all rate pairs $(R_1, R_2)$ such
that

$$R_1 \le I(U_1; Y | X_2, S_2, Q) - I(U_1; S_1 | X_2, S_2, Q), \quad (3a)$$

$$R_2 \le I(X_2, S_2; Y | U_1, Q) - H(S_2 | U_1, Q), \quad (3b)$$

$$R_1 + R_2 \le I(U_1, X_2, S_2; Y | Q) - H(S_2) - I(U_1; S_1 | X_2, S_2, Q) \quad (3c)$$

for some $(Q, S_1, S_2, (U_1, X_1), (X_2, X_2), Y) \in \mathcal{P}^{i}_{\mathrm{MAC}}(\Delta_1, \Delta_2)$, where $U_1$ and $Q$ are auxiliary
random variables. Then, $\mathcal{R}^{i}_{\mathrm{MAC},B}(\Delta_1, \Delta_2) \subseteq \mathcal{C}_{\mathrm{MAC},B}(\Delta_1, \Delta_2)$.

Remarks

• The inner bound in Proposition 2 is a special case of an inner bound in [24], which considers
the state-dependent MAC with state known at one encoder and recovery of only the messages
at the decoder. To obtain the inner bound in Proposition 2, substitute $(X_2, S_2)$ in place of
$X_2$ in the inner bound in [24].

• To achieve the inner bound, distortion-constrained Gel'fand-Pinsker coding is used to embed
$W_1$ into the host sequence $S_1^n$, and distortion-constrained superposition coding is used to
embed $W_2$ into the host sequence $S_2^n$.

• If we choose $U_2 = (X_2, S_2)$ in Proposition 1, we obtain the inner bound in Proposition 2.
Thus, $\mathcal{R}^{i}_{\mathrm{MAC},B}(\Delta_1, \Delta_2) \subseteq \mathcal{R}^{i}_{\mathrm{MAC},A}(\Delta_1, \Delta_2)$.

C. Recovery of Both Hosts 

In this section, we derive inner and outer bounds on the MAC IE capacity region for Case C,
in which the decoder recovers $(W_1, S_1^n, W_2, S_2^n)$ from $Y^n$. We define the MAC IE capacity region
$\mathcal{C}_{\mathrm{MAC},C}(\Delta_1, \Delta_2)$ as the closure of all MAC IE achievable rates $(R_1, R_2)$ with
$P_e^n := \Pr[g_C^n(Y^n) \ne (W_1, S_1^n, W_2, S_2^n)] \to 0$ as $n \to \infty$. The following theorem gives an inner bound for the capacity
region.

Theorem 1: Let $\mathcal{R}^{i}_{\mathrm{MAC},C}(\Delta_1, \Delta_2)$ be the set of all rate pairs $(R_1, R_2)$ such that

$$R_1 \le I(X_1, S_1; Y | X_2, S_2, Q) - H(S_1 | S_2), \quad (4a)$$

$$R_2 \le I(X_2, S_2; Y | X_1, S_1, Q) - H(S_2 | S_1), \quad (4b)$$

$$R_1 + R_2 \le I(X_1, S_1, X_2, S_2; Y | Q) - H(S_1, S_2), \quad (4c)$$

for some $(Q, S_1, S_2, (X_1, X_1), (X_2, X_2), Y) \in \mathcal{P}^{i}_{\mathrm{MAC}}(\Delta_1, \Delta_2)$. Then,
$\mathcal{R}^{i}_{\mathrm{MAC},C}(\Delta_1, \Delta_2) \subseteq \mathcal{C}_{\mathrm{MAC},C}(\Delta_1, \Delta_2)$.

Proof: See Appendix A.

The following theorem gives an outer bound for the capacity region if $S_1$ and $S_2$ are correlated.

Theorem 2: Let $\mathcal{R}^{o}_{\mathrm{MAC},C}(\Delta_1, \Delta_2)$ be the set of all rate pairs $(R_1, R_2)$ such that

$$R_1 \le I(X_1, S_1; Y | X_2, S_2, Q) - H(S_1 | S_2), \quad (5a)$$

$$R_2 \le I(X_2, S_2; Y | X_1, S_1, Q) - H(S_2 | S_1), \quad (5b)$$

$$R_1 + R_2 \le I(X_1, S_1, X_2, S_2; Y | Q) - H(S_1, S_2), \quad (5c)$$

for some $(Q, S_1, S_2, X_1, X_2, Y) \in \mathcal{P}^{o}_{\mathrm{MAC}}(\Delta_1, \Delta_2)$. If the host random variables $S_1$ and $S_2$ are
correlated, then

$$\mathcal{C}_{\mathrm{MAC},C}(\Delta_1, \Delta_2) \subseteq \mathcal{R}^{o}_{\mathrm{MAC},C}(\Delta_1, \Delta_2).$$

If the host random variables $S_1$ and $S_2$ are independent, then

$$\mathcal{C}_{\mathrm{MAC},C}(\Delta_1, \Delta_2) \subseteq \mathcal{R}^{i}_{\mathrm{MAC},C}(\Delta_1, \Delta_2).$$

Proof: See Appendix B.

The following corollary of Theorem 1 and Theorem 2 states the MAC IE capacity region
for a given pair of distortion constraints $(\Delta_1, \Delta_2)$ if the host random variables $S_1$ and $S_2$ are
independent.

Corollary 1: If the host random variables $S_1$ and $S_2$ are independent, then the capacity region
$\mathcal{C}_{\mathrm{MAC},C}(\Delta_1, \Delta_2)$ is the closure of the set of all rate pairs $(R_1, R_2)$ such that

$$R_1 \le I(X_1, S_1; Y | X_2, S_2, Q) - H(S_1 | S_2), \quad (6a)$$

$$R_2 \le I(X_2, S_2; Y | X_1, S_1, Q) - H(S_2 | S_1), \quad (6b)$$

$$R_1 + R_2 \le I(X_1, S_1, X_2, S_2; Y | Q) - H(S_1, S_2), \quad (6c)$$

for some $(Q, S_1, S_2, (X_1, X_1), (X_2, X_2), Y) \in \mathcal{P}^{i}_{\mathrm{MAC}}(\Delta_1, \Delta_2)$.

Remarks

• To compute either (4) or (5), it is sufficient to consider a time-sharing random variable $Q$
with $|\mathcal{Q}| \le 4$ by Carathéodory's theorem [30].

• In most communication scenarios, message transmission rates of zero are achievable. However,
in this model, message transmission rates of zero can be unachievable if the host source
pair $p(s_1, s_2)$ is such that the upper bounds on $R_1$, $R_2$, and $R_1 + R_2$ in (4) are negative.
This is because we require host recovery at the decoder as well.
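
The last remark can be checked with a simple sufficient condition. Since $I(X_1, S_1, X_2, S_2; Y | Q) \le \log_2 |\mathcal{Y}|$, the sum-rate bound in Case C can never exceed $\log_2 |\mathcal{Y}| - H(S_1, S_2)$; if this ceiling is negative, no rate pair, not even $(0, 0)$, satisfies the bound. The Python sketch below computes this ceiling; the function names and the example host distribution are assumptions for illustration.

```python
from math import log2


def joint_entropy(p_s1s2):
    """H(S1,S2) in bits for a joint pmf {(s1, s2): probability}."""
    return -sum(p * log2(p) for p in p_s1s2.values() if p > 0)


def sum_rate_ceiling(p_s1s2, output_alphabet_size):
    """Upper limit log2|Y| - H(S1,S2) on the Case C sum-rate bound."""
    return log2(output_alphabet_size) - joint_entropy(p_s1s2)


# Assumed example: two independent uniform quaternary hosts (4 bits jointly)
# sent over a MAC with a binary output alphabet.
p_hosts = {(a, b): 1 / 16 for a in range(4) for b in range(4)}
ceiling = sum_rate_ceiling(p_hosts, 2)
print(f"sum-rate ceiling: {ceiling:.1f} bits per channel use")
if ceiling < 0:
    print("both hosts cannot be recovered losslessly, even at zero message rate")
```

A non-negative ceiling does not by itself guarantee achievability, since the actual mutual information depends on the channel and on the distortion constraints; the check only rules out hopeless configurations.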

III. Degraded BC IE 

In this section, we formally define the BC IE model shown in Figure 2. A host sequence $S^n =
(S_1, S_2, \ldots, S_n)$ is an independent and identically distributed (i.i.d.) discrete random sequence
whose elements are drawn with probability mass function $p(s)$, $s \in \mathcal{S}$. All alphabets are discrete.
We assume that the host sequence $S^n$ is non-causally known at the encoder. The encoder embeds
a message pair $(W_1, W_2)$ into the host sequence $S^n$ such that the average distortion between $S^n$
and the embedded sequence $X^n$ satisfies a given distortion constraint $\Delta$. The messages $W_1 \in
\{1, 2, \ldots, M_1\}$ and $W_2 \in \{1, 2, \ldots, M_2\}$ are drawn uniformly, with probabilities $1/M_1$ and
$1/M_2$, respectively. The rate of message $W_i$ is then given by $R_i = (1/n) \log_2 M_i$ bits per channel
use, for $i = 1, 2$. It is also assumed that the message $W_i$ is independent of the other message
and the host sequence for $i = 1, 2$.

Definition 5: A $(M_1, M_2, D^{(n)}, n)$ BC IE code consists of a sequence of encoding functions
at the encoder

$$f^n : \mathcal{W}_1 \times \mathcal{W}_2 \times \mathcal{S}^n \to \mathcal{X}^n,$$

and a sequence of decoding functions at Decoder 1 and Decoder 2,

• No Host Recovery: $g_{1,A'}^n : \mathcal{Y}^n \to (\mathcal{W}_1, \mathcal{W}_2)$ and $g_{2,A'}^n : \mathcal{Z}^n \to \mathcal{W}_2$
• Host Recovery at the Better Decoder: $g_{1,B'}^n : \mathcal{Y}^n \to (\mathcal{W}_1, \mathcal{W}_2, \mathcal{S}^n)$ and $g_{2,B'}^n : \mathcal{Z}^n \to \mathcal{W}_2$
• Host Recovery at Both Decoders: $g_{1,C'}^n : \mathcal{Y}^n \to (\mathcal{W}_1, \mathcal{W}_2, \mathcal{S}^n)$ and $g_{2,C'}^n : \mathcal{Z}^n \to (\mathcal{W}_2, \mathcal{S}^n)$
• Host Recovery at the Worse Decoder: $g_{1,D'}^n : \mathcal{Y}^n \to (\mathcal{W}_1, \mathcal{W}_2)$ and $g_{2,D'}^n : \mathcal{Z}^n \to (\mathcal{W}_2, \mathcal{S}^n)$,

respectively. The associated distortion is defined as $D^{(n)} = E\,d^n(S^n, X^n)$, where $d^n(S^n, X^n) =
(1/n) \sum_{j=1}^{n} d(S_j, X_j)$ for a given non-negative bounded distortion measure $d(\cdot, \cdot)$.

The embedded signal $X^n$ is transmitted across a discrete memoryless degraded broadcast
channel (DMDBC) with state, $p(y|x, s)\,p(z|y)$, modeled as a memoryless conditional probability
distribution

$$\Pr(y^n, z^n | x^n, s^n) = \prod_{j=1}^{n} p(y_j | x_j, s_j)\,p(z_j | y_j). \quad (7)$$

Definition 6: A rate pair $(R_1, R_2)$ for a given distortion $\Delta$ is said to be BC IE achievable if
there exists a sequence of $(\lceil 2^{nR_1} \rceil, \lceil 2^{nR_2} \rceil, D^{(n)}, n)$ BC IE codes with $\lim_{n \to \infty} D^{(n)} \le \Delta$ and
$\lim_{n \to \infty} P_e^n = 0$, where $P_e^n$ is the probability of error defined appropriately for each case in the
sequel of the paper.

Definition 7: For a given $p(s)$ and $p(y|x, s)\,p(z|y)$, let $\mathcal{P}(\Delta)$ be the collection of random
variables $(T, S, X, Y, Z)$ with joint probability mass function satisfying the following conditions

a) $p(t, s, x, y, z) = p(t, s, x)\,p(y|x, s)\,p(z|y)$,

b) $\sum_{t \in \mathcal{T}, x \in \mathcal{X}} p(t, x, s) = p(s)$,

c) $E\,d(S, X) \le \Delta$,

where $T$ is an auxiliary random variable.

A. No Host Recovery 

In this section, we state inner and outer bounds for the BC IE capacity region in Case A',
in which Decoder 1 recovers $(W_1, W_2)$ from $Y^n$ and Decoder 2 recovers $W_2$ from $Z^n$. The BC
IE capacity region $\mathcal{C}_{A'}(\Delta)$ is the closure of all BC IE achievable rates $(R_1, R_2)$ with
$P_e^n := \Pr[g_{1,A'}^n(Y^n) \ne (W_1, W_2) \text{ or } g_{2,A'}^n(Z^n) \ne W_2] \to 0$ as $n \to \infty$.

Proposition 3: Let $\mathcal{R}^{i}_{A'}(\Delta)$ be the closure of the set of all rate pairs $(R_1, R_2)$ such that

$$R_1 \le I(V; Y | U) - I(V; S | U), \quad (8a)$$

$$R_2 \le I(U; Z) - I(U; S), \quad (8b)$$

for some $((U, V), S, X, Y, Z) \in \mathcal{P}(\Delta)$, where $U$ and $V$ are auxiliary random variables with
alphabet sizes satisfying $|\mathcal{U}| \le |\mathcal{X}||\mathcal{S}| + 1$ and $|\mathcal{V}| \le |\mathcal{X}||\mathcal{S}|(|\mathcal{X}||\mathcal{S}| + 1)$, respectively. Let $\mathcal{R}^{o}_{A'}(\Delta)$
be the closure of the set of all rate pairs $(R_1, R_2)$ such that

$$R_1 \le I(V; Y | U, W) - I(V; S | U, W), \quad (9a)$$

$$R_2 \le I(U; Z) - I(U; S), \quad (9b)$$

$$R_1 + R_2 \le I(U, V, W; Y) - I(U, V, W; S), \quad (9c)$$

for some $((U, V, W), S, X, Y, Z) \in \mathcal{P}(\Delta)$, where $U$, $V$, and $W$ are auxiliary random variables
with alphabet sizes satisfying $|\mathcal{U}| \le |\mathcal{X}||\mathcal{S}| + 2$, $|\mathcal{V}| \le |\mathcal{X}||\mathcal{S}|(|\mathcal{X}||\mathcal{S}| + 2) + 1$, and
$|\mathcal{W}| \le (|\mathcal{X}||\mathcal{S}|(|\mathcal{X}||\mathcal{S}| + 2) + 1)(|\mathcal{X}||\mathcal{S}| + 2)|\mathcal{X}||\mathcal{S}| + 1$, respectively. Then,
$\mathcal{R}^{i}_{A'}(\Delta) \subseteq \mathcal{C}_{A'}(\Delta) \subseteq \mathcal{R}^{o}_{A'}(\Delta)$.

Remarks

The inner and outer bounds in Proposition 3 are slightly different from those in [21], which
does not consider an encoder distortion constraint. Although essentially the same proofs as in [21]
apply, here there is an additional constraint on the joint probability mass functions in $\mathcal{P}(\Delta)$ that limits
the average distortion between the host $S$ and the channel input $X$ to at most $\Delta$. To achieve
the inner bound, Gel'fand-Pinsker codes can be used to embed the messages $(W_1, W_2)$ into the
host sequence $S^n$.

B. Host Recovery at the Better Decoder 

In this section, we derive inner and outer bounds on the BC IE capacity region in Case B',
in which Decoder 1 recovers $(W_1, W_2)$ and $S^n$ from $Y^n$ and Decoder 2 recovers only $W_2$ from
$Z^n$. We define the BC IE capacity region $\mathcal{C}_{B'}(\Delta)$ as the closure of all BC IE achievable rates
$(R_1, R_2)$ with $P_e^n := \Pr[g_{1,B'}^n(Y^n) \ne (W_1, W_2, S^n) \text{ or } g_{2,B'}^n(Z^n) \ne W_2] \to 0$ as $n \to \infty$. The
following two theorems give inner and outer bounds for the capacity region in this case.

Theorem 3: Let $\mathcal{R}^{i}_{B'}(\Delta)$ be the closure of the set of all rate pairs $(R_1, R_2)$ such that

$$R_1 \le I(X, S; Y | U) - H(S | U), \quad (10a)$$

$$R_2 \le I(U; Z) - I(U; S), \quad (10b)$$

for some $(U, S, X, Y, Z) \in \mathcal{P}(\Delta)$, where $U$ is an auxiliary random variable with alphabet size
satisfying $|\mathcal{U}| \le |\mathcal{X}||\mathcal{S}| + 1$. Then $\mathcal{R}^{i}_{B'}(\Delta) \subseteq \mathcal{C}_{B'}(\Delta)$.

Proof: See Appendix C.

Theorem 4: Let $\mathcal{R}^{o}_{B'}(\Delta)$ be the closure of the set of all rate pairs $(R_1, R_2)$ such that

$$R_1 \le I(X, S; Y | U) - H(S | U), \quad (11a)$$

$$R_2 \le I(U, V; Z) - I(U, V; S), \quad (11b)$$

for some $((U, V), S, X, Y, Z) \in \mathcal{P}(\Delta)$, where $U$ and $V$ are auxiliary random variables with
alphabet sizes satisfying $|\mathcal{U}| \le |\mathcal{X}||\mathcal{S}| + 1$ and $|\mathcal{V}| \le |\mathcal{X}||\mathcal{S}|(|\mathcal{X}||\mathcal{S}| + 1)$, respectively. Then
$\mathcal{C}_{B'}(\Delta) \subseteq \mathcal{R}^{o}_{B'}(\Delta)$.

Proof: See Appendix D.

Remarks

To obtain the above inner bound, the message $W_2$ is embedded into the host sequence $S^n$
using Gel'fand-Pinsker coding, and the message $W_1$ is embedded into the host sequence using
superposition coding such that the distortion constraint is satisfied. The above inner and outer
bounds are already convex regions, so there is no need to introduce time-sharing auxiliary
random variables. The constraint on $R_2$ in the outer bound given in (11) can be written as

$$I(U, V; Z) - I(U, V; S) = I(U; Z) - I(U; S) + \{ I(V; Z | U) - I(V; S | U) \}.$$

The term $I(V; Z | U) - I(V; S | U)$ is the difference between the inner and outer bounds. If $V$ is
a deterministic function of $U$, the inner and outer bounds coincide. This clearly shows that
$\mathcal{R}^{i}_{B'}(\Delta) \subseteq \mathcal{R}^{o}_{B'}(\Delta)$.

C. Host Recovery at Both Decoders 

This section derives the BC IE capacity region in Case C', in which Decoder 1 recovers
$(W_1, W_2)$ and $S^n$ from $Y^n$ and Decoder 2 recovers $W_2$ and $S^n$ from $Z^n$. We define the BC
IE capacity region $\mathcal{C}_{C'}(\Delta)$ as the closure of all BC IE achievable rates $(R_1, R_2)$ with
$P_e^n := \Pr[g_{1,C'}^n(Y^n) \ne (W_1, W_2, S^n) \text{ or } g_{2,C'}^n(Z^n) \ne (W_2, S^n)] \to 0$ as $n \to \infty$.

Theorem 5: $\mathcal{C}_{C'}(\Delta)$ is the closure of the set of all rate pairs $(R_1, R_2)$ such that

$$R_1 \le I(X; Y | U, S), \quad (12a)$$

$$R_2 \le I(X, S; Z) - H(S), \quad (12b)$$

for some $(U, S, X, Y, Z) \in \mathcal{P}(\Delta)$, where $U$ is an auxiliary random variable with $|\mathcal{U}| \le |\mathcal{X}||\mathcal{S}|$.

Proof: See Appendix E.

Remarks

To achieve the BC IE capacity region, the messages $(W_1, W_2)$ are embedded into the host
sequence using distortion-constrained superposition coding as in the previous cases because
lossless recovery, i.e., reversible embedding, of the host sequence $S^n$ is required in Case C'.

D. Host Recovery at the Worse Decoder

This section derives the BC IE capacity region in Case D', in which Decoder 1 recovers
$(W_1, W_2)$ from $Y^n$ and Decoder 2 recovers $W_2$ and $S^n$ from $Z^n$. We define the broadcast IE
capacity region $\mathcal{C}_{D'}(\Delta)$ as the closure of all BC IE achievable rates $(R_1, R_2)$ with
$P_e^n := \Pr[g_{1,D'}^n(Y^n) \ne (W_1, W_2) \text{ or } g_{2,D'}^n(Z^n) \ne (W_2, S^n)] \to 0$ as $n \to \infty$.

Corollary 2: $\mathcal{C}_{D'}(\Delta) = \mathcal{C}_{C'}(\Delta)$.

Proof: Since $Z^n$ is a degraded version of $Y^n$, and $(W_2, S^n)$ must be reliably decoded from $Z^n$,
$(W_2, S^n)$ can also be decoded from $Y^n$. This implies that the BC IE capacity region in Case D'
is the same as in Case C'.



Appendix 

We present definitions related to strong typicality [30], [31], [32] and important theorems
based on strong typicality, which will be used throughout the section.

Definition 8: A sequence $x^n \in \mathcal{X}^n$ is said to be $\epsilon$-strongly typical with respect to a distribution
$p(x)$ on $\mathcal{X}$, or $x^n \in T_\epsilon^n(X)$, if

$$\left| \frac{1}{n} N(a | x^n) - p(a) \right| < \frac{\epsilon}{|\mathcal{X}|}$$

for all $a \in \mathcal{X}$ with $p(a) > 0$, and $N(a | x^n) = 0$ for all $a \in \mathcal{X}$ with $p(a) = 0$, where $N(a | x^n)$ is
the number of occurrences of the symbol $a$ in the sequence $x^n$.
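
Definition 8 translates directly into a membership test. The Python sketch below checks whether a given sequence is $\epsilon$-strongly typical, assuming the threshold $\epsilon / |\mathcal{X}|$ on the deviation of each empirical frequency; the function name and the sample sequence are hypothetical.

```python
from collections import Counter


def is_strongly_typical(seq, pmf, eps):
    """Check eps-strong typicality of seq with respect to pmf.

    pmf maps every symbol of the alphabet to its probability, including
    symbols of probability zero.
    """
    n = len(seq)
    counts = Counter(seq)
    # Symbols outside the alphabet can never appear in a typical sequence.
    if any(a not in pmf for a in counts):
        return False
    threshold = eps / len(pmf)
    for a, pa in pmf.items():
        if pa == 0:
            if counts[a] != 0:
                return False
        elif abs(counts[a] / n - pa) >= threshold:
            return False
    return True


pmf = {0: 0.5, 1: 0.5}
seq = [0, 1, 0, 1, 0, 1, 0, 0]  # empirical frequency of 0 is 5/8
print(is_strongly_typical(seq, pmf, eps=0.3))  # deviation 0.125 < 0.15
print(is_strongly_typical(seq, pmf, eps=0.2))  # deviation 0.125 >= 0.10
```

The same sequence is typical for the looser tolerance and atypical for the tighter one, which shows how $\epsilon$ controls the size of the typical set.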

Definition 9: A pair of sequences $(x^n, y^n) \in \mathcal{X}^n \times \mathcal{Y}^n$ is said to be jointly $\epsilon$-strongly typical
with respect to a distribution $p(x, y)$ on $\mathcal{X} \times \mathcal{Y}$, or $(x^n, y^n) \in T_\epsilon^n(X, Y)$, if

$$\left| \frac{1}{n} N(a, b | x^n, y^n) - p(a, b) \right| < \frac{\epsilon}{|\mathcal{X}||\mathcal{Y}|}$$

for all $(a, b) \in \mathcal{X} \times \mathcal{Y}$ with $p(a, b) > 0$, and $N(a, b | x^n, y^n) = 0$ for all $(a, b) \in \mathcal{X} \times \mathcal{Y}$ with
$p(a, b) = 0$, where $N(a, b | x^n, y^n)$ is the number of occurrences of the symbol pair $(a, b)$ in the pair
of sequences $(x^n, y^n)$.

For completeness, we recall theorems on strong typicality [30], [31], [32] which will be used
throughout this section.

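Lemma 1: Suppose $X^n$ is generated from a discrete memoryless source (DMS) $p(x)$ and $x^n \in
T_\epsilon^n(X)$. Then, we have the following:

$$2^{-n[H(X) + \epsilon_1]} \le p^n(x^n) \le 2^{-n[H(X) - \epsilon_1]} \quad (13)$$

$$(1 - \epsilon_2)\, 2^{n[H(X) - \epsilon_1]} \le |T_\epsilon^n(X)| \le 2^{n[H(X) + \epsilon_1]} \quad (14)$$

$$(1 - \epsilon_2) \le \Pr[X^n \in T_\epsilon^n(X)] \le 1 \quad (15)$$

where $\epsilon_1 \to 0$ as $\epsilon \to 0$, and $\epsilon_2 \to 0$ as $n \to \infty$ for fixed $\epsilon$.

Lemma 2: Suppose $(X^n, Y^n)$ is generated from a discrete memoryless source (DMS) $p(x, y)$
and $(x^n, y^n) \in T_\epsilon^n(X, Y)$. Then, we have the following:

$$2^{-n[H(X,Y) + \epsilon_1']} \le p^n(x^n, y^n) \le 2^{-n[H(X,Y) - \epsilon_1']} \quad (16)$$

$$(1 - \epsilon_2')\, 2^{n[H(X,Y) - \epsilon_1']} \le |T_\epsilon^n(X, Y)| \le 2^{n[H(X,Y) + \epsilon_1']} \quad (17)$$

$$(1 - \epsilon_2') \le \Pr[(X^n, Y^n) \in T_\epsilon^n(X, Y)] \le 1 \quad (18)$$

where $\epsilon_1' \to 0$ as $\epsilon \to 0$, and $\epsilon_2' \to 0$ as $n \to \infty$ for fixed $\epsilon$.

Lemma 3: Suppose $(X^n, Y^n)$ is generated from a discrete memoryless source (DMS) $p(x, y)$
and $(x^n, y^n) \in T_\epsilon^n(X, Y)$. Then, we have the following:

$$2^{-n[H(Y|X) + \epsilon_1'']} \le p^n(y^n | x^n) \le 2^{-n[H(Y|X) - \epsilon_1'']} \quad (19)$$

$$(1 - \epsilon_2'')\, 2^{n[H(Y|X) - \epsilon_1'']} \le |T_\epsilon^n(X, Y | x^n)| \le 2^{n[H(Y|X) + \epsilon_1'']} \quad (20)$$

$$(1 - \epsilon_2'') \le \Pr[(x^n, Y^n) \in T_\epsilon^n(X, Y)] \le 1 \quad (21)$$

where $\epsilon_1'' \to 0$ as $\epsilon \to 0$, and $\epsilon_2'' \to 0$ as $n \to \infty$ for fixed $\epsilon$, and
$T_\epsilon^n(X, Y | x^n) = \{ y^n : (x^n, y^n) \in T_\epsilon^n(X, Y) \}$.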
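
The probability bound (13) in Lemma 1 is equivalent to saying that $-\frac{1}{n} \log_2 p^n(x^n)$ lies within $\epsilon_1$ of $H(X)$ for every strongly typical $x^n$. The short Python sketch below computes this normalized log-probability for a memoryless source and compares it with the entropy; the source distribution and the sample sequence are assumptions chosen so that the empirical frequencies match the pmf exactly.

```python
from math import log2


def entropy(pmf):
    """Entropy in bits of a pmf given as {symbol: probability}."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)


def normalized_log_prob(seq, pmf):
    """-(1/n) log2 p^n(x^n) for an i.i.d. source with per-letter pmf."""
    return -sum(log2(pmf[a]) for a in seq) / len(seq)


pmf = {0: 0.75, 1: 0.25}
seq = [0, 0, 0, 1, 0, 0, 0, 1]  # six zeros and two ones: frequencies match pmf
gap = abs(normalized_log_prob(seq, pmf) - entropy(pmf))
print(f"H(X) = {entropy(pmf):.4f} bits, gap = {gap:.2e}")
```

When the empirical frequencies deviate slightly from the pmf, as strong typicality permits, the gap becomes a small positive number that shrinks with $\epsilon$, which is the role played by $\epsilon_1$ in the lemma.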
A. Proof of Theorem 1

In this section, we demonstrate the existence of a sequence of MAC IE codes
$(\lceil 2^{nR_1} \rceil, \lceil 2^{nR_2} \rceil, D_1^{(n)}, D_2^{(n)}, n)$ with $\lim_{n \to \infty} P_e^n = 0$ and $\lim_{n \to \infty} D_i^{(n)} \le \Delta_i$ for $i = 1, 2$ if the
rate pair $(R_1, R_2)$ satisfies (4). Fix $(Q, S_1, S_2, (X_1, X_1), (X_2, X_2), Y) \in \mathcal{P}^{i}_{\mathrm{MAC}}(\Delta_1, \Delta_2)$ and $n$.
We construct a MAC IE code $(\lceil 2^{nR_1} \rceil, \lceil 2^{nR_2} \rceil, n)$ as follows.

• Code construction: Throughout the achievability proof, let $i \in \mathcal{I} = \{1, 2\}$. Generate a
time-sharing sequence $Q^n = (Q_1, Q_2, \ldots, Q_n)$ whose elements are i.i.d. with distribution
$p(q)$. At Encoder $i$, for each $s_i^n \in \mathcal{S}_i^n$, generate $\lceil 2^{nR_i} \rceil$ sequences $X_i^n$ drawn according
to $\prod_{j=1}^{n} p(x_{ij} | s_{ij}, q_j)$. Call these sequences $X_i^n(Q^n, s_i^n, m_i)$, where $m_i \in \{1, 2, \ldots, \lceil 2^{nR_i} \rceil\}$,
$i = 1, 2$. In this way, the codebooks are generated at each encoder and revealed to the
decoder.

Since the sequence $Q^n$ serves as a time-sharing sequence, it can be assumed that the sequence
$Q^n$ is known at both the encoders and at the decoder without loss of generality.

• Encoding: Encoder $i$, upon observing $S_i^n$ at the output of host source $i$ and the time-sharing
random sequence $Q^n$, sends message $W_i \in \{1, 2, \ldots, \lceil 2^{nR_i} \rceil\}$ by transmitting the codeword
$X_i^n(Q^n, S_i^n, W_i)$. In this way, the codeword $X_i^n$ is chosen and transmitted from Encoder $i$
for a given time-sharing sequence $Q^n$, a given host sequence $S_i^n$, and a message $W_i$.

• Decoding: Fix $0 < \epsilon_1 < \epsilon$. Since the decoder knows the time-sharing sequence $Q^n = q^n$, the
decoder, upon receiving the channel output $Y^n$, looks for a tuple $(X_1^n(q^n, \hat{s}_1^n, \hat{m}_1), X_2^n(q^n, \hat{s}_2^n, \hat{m}_2))$
such that $(X_1^n(q^n, \hat{s}_1^n, \hat{m}_1), X_2^n(q^n, \hat{s}_2^n, \hat{m}_2), Y^n) \in T_\epsilon^n[Q, S_1, S_2, X_1, X_2, Y | q^n, \hat{s}_1^n, \hat{s}_2^n]$ for
$(\hat{s}_1^n, \hat{s}_2^n) \in T_{\epsilon_1}^n[S_1, S_2]$. If a unique vector of sequences exists, the decoder declares that
$(\hat{W}_1, \hat{W}_2, \hat{S}_1^n, \hat{S}_2^n) = (\hat{m}_1, \hat{m}_2, \hat{s}_1^n, \hat{s}_2^n)$. Otherwise, the decoder declares an error. In this way,
the messages and the host sequences are decoded at the decoder.

• Probability of error: The average probability of error is given by

$$P_e^n = \sum_{(q^n, s_1^n, s_2^n)} p(q^n)\,p(s_1^n, s_2^n)\,\Pr[\text{error} | (s_1^n, s_2^n, q^n)]$$
$$\le \sum_{(q^n, s_1^n, s_2^n) \notin T_{\epsilon_1}^n[Q, S_1, S_2]} p(q^n)\,p(s_1^n, s_2^n) + \sum_{(q^n, s_1^n, s_2^n) \in T_{\epsilon_1}^n[Q, S_1, S_2]} p(s_1^n, s_2^n)\,p(q^n)\,\Pr[\text{error} | (s_1^n, s_2^n, q^n)]. \quad (22)$$

The first term, $\Pr[(q^n, s_1^n, s_2^n) \notin T_{\epsilon_1}^n[Q, S_1, S_2]]$, on the right-hand side of (22)
goes to zero as $n \to \infty$ by Lemma 2.

Without loss of generality, it can be assumed that the time-sharing sequence is $q^n$, the
output of host source $i$ is $s_i^n$, and $W_i = 1$ is being transmitted from Encoder $i$. Hence,
the codeword $X_i^n(q^n, s_i^n, 1)$ is transmitted from Encoder $i$. It is also assumed that the
time-sharing random sequence $Q^n = q^n$ is known at both the encoders and the decoder. Let $F$
be the event that $(s_1^n, s_2^n)$ and $q^n$ are the output of the host source pair and the time-sharing
sequence, respectively, and $(q^n, s_1^n, s_2^n) \in T_{\epsilon_1}^n[Q, S_1, S_2]$.

The following error events are considered to compute $\Pr[\text{error} | F]$, and their probabilities can be made to
approach zero as $n \to \infty$.

1) $E_1$: $(X_1^n(q^n, s_1^n, 1), X_2^n(q^n, s_2^n, 1), Y^n) \notin T_\epsilon^n[Q, S_1, S_2, X_1, X_2, Y | q^n, s_1^n, s_2^n]$ under the
event $F$. By using Lemma 2, we can show that $\Pr[E_1 | F] \to 0$ as $n \to \infty$.

2) $E_2$: $(X_1^n(q^n, s_1^n, m_1), X_2^n(q^n, s_2^n, 1), Y^n) \in T_\epsilon^n[Q, S_1, S_2, X_1, X_2, Y | q^n, s_1^n, s_2^n]$ under the
event $F$ for all $m_1 \ne 1$. It can be shown that $\Pr(E_2 | F) \to 0$ as $n \to \infty$ by using
Lemma 2 and Lemma 3 if $0 < R_1 < I(X_1; Y | S_1, S_2, X_2, Q)$.

3) $E_3$: $(X_1^n(q^n, \hat{s}_1^n, m_1), X_2^n(q^n, s_2^n, 1), Y^n) \in T_\epsilon^n[Q, S_1, S_2, X_1, X_2, Y | q^n, \hat{s}_1^n, s_2^n]$ under the
event $F$ for all $m_1 \in \mathcal{M}_1$ and for all $\hat{s}_1^n \ne s_1^n$ with $\hat{s}_1^n \in T_{\epsilon_1}^n[S_1, S_2 | s_2^n]$. It can be
shown that $\Pr(E_3 | F) \to 0$ as $n \to \infty$ by using Lemma 2 and Lemma 3 if
$0 < R_1 < I(S_1, X_1; Y | S_2, X_2, Q) - H(S_1 | S_2)$.

4) $E_4$: $(X_1^n(q^n, s_1^n, 1), X_2^n(q^n, s_2^n, m_2), Y^n) \in T_\epsilon^n[Q, S_1, S_2, X_1, X_2, Y | q^n, s_1^n, s_2^n]$ under the
event $F$ for all $m_2 \ne 1$. It can be shown that $\Pr(E_4 | F) \to 0$ as $n \to \infty$ by using
Lemma 2 and Lemma 3 if $0 < R_2 < I(X_2; Y | S_1, X_1, S_2, Q)$.

5) $E_5$: $(X_1^n(q^n, s_1^n, 1), X_2^n(q^n, \hat{s}_2^n, m_2), Y^n) \in T_\epsilon^n[Q, S_1, S_2, X_1, X_2, Y | q^n, s_1^n, \hat{s}_2^n]$ under the
event $F$ for all $m_2 \in \mathcal{M}_2$, $\hat{s}_2^n \ne s_2^n$, and $\hat{s}_2^n \in T_{\epsilon_1}^n[S_1, S_2 | s_1^n]$. It can be shown
that $\Pr(E_5 | F) \to 0$ as $n \to \infty$ by using Lemma 2 and Lemma 3 if
$0 < R_2 < I(X_2, S_2; Y | S_1, X_1, Q) - H(S_2 | S_1)$.

6) $E_6$: $(X_1^n(q^n, s_1^n, m_1), X_2^n(q^n, \hat{s}_2^n, m_2), Y^n) \in T_\epsilon^n[Q, S_1, S_2, X_1, X_2, Y | q^n, s_1^n, \hat{s}_2^n]$ under the
event $F$ for all $m_1 \in \mathcal{M}_1$, $m_2 \in \mathcal{M}_2$, $\hat{s}_2^n \ne s_2^n$, and $\hat{s}_2^n \in T_{\epsilon_1}^n[S_1, S_2 | s_1^n]$. It can
be shown that $\Pr(E_6 | F) \to 0$ as $n \to \infty$ by using Lemma 2 and Lemma 3 if
$R_1 + R_2 < I(X_1, S_2, X_2; Y | S_1, Q) - H(S_2 | S_1)$.

7) $E_7$: $(X_1^n(q^n, \hat{s}_1^n, m_1), X_2^n(q^n, \hat{s}_2^n, m_2), Y^n) \in T_\epsilon^n[Q, S_1, S_2, X_1, X_2, Y | q^n, \hat{s}_1^n, \hat{s}_2^n]$ under the
event $F$ for all $m_1 \in \mathcal{M}_1$, $m_2 \in \mathcal{M}_2$, $(\hat{s}_1^n, \hat{s}_2^n) \ne (s_1^n, s_2^n)$, and $(\hat{s}_1^n, \hat{s}_2^n) \in T_{\epsilon_1}^n[S_1, S_2]$.
It can be shown that $\Pr(E_7 | F) \to 0$ as $n \to \infty$ by using Lemma 2 and Lemma 3 if
$0 < R_1 + R_2 < I(S_1, X_1, S_2, X_2; Y | Q) - H(S_1, S_2)$.

8) $E_8$: $(X_1^n(q^n, \hat{s}_1^n, m_1), X_2^n(q^n, s_2^n, m_2), Y^n) \in T_\epsilon^n[Q, S_1, X_1, S_2, X_2, Y | q^n, \hat{s}_1^n, s_2^n]$ under the
event $F$ for all $m_1 \ne 1$, $m_2 \in \mathcal{M}_2$, $\hat{s}_1^n \ne s_1^n$, and $\hat{s}_1^n \in T_{\epsilon_1}^n[S_1, S_2 | s_2^n]$. It can
be shown that $\Pr(E_8 | F) \to 0$ as $n \to \infty$ by using Lemma 2 and Lemma 3 if
$0 < R_1 + R_2 < I(S_1, X_1, X_2; Y | S_2, Q) - H(S_1 | S_2)$.

9) $E_9$: $(X_1^n(q^n, s_1^n, m_1), X_2^n(q^n, s_2^n, m_2), Y^n) \in T_\epsilon^n[Q, S_1, S_2, X_1, X_2, Y | q^n, s_1^n, s_2^n]$ under the
event $F$ for all $m_1 \ne 1$ and $m_2 \ne 1$. It can be shown that $\Pr(E_9 | F) \to 0$ as $n \to \infty$
by using Lemma 2 and Lemma 3 if $0 < R_1 + R_2 < I(X_1, X_2; Y | S_1, S_2, Q)$.

Then, by using the union bound, $\Pr[\text{error} | F] \le \sum_{j=1}^{9} \Pr[E_j | F]$. $\Pr[\text{error} | F]$ goes to zero
as $n \to \infty$ since $\Pr(E_j | F) \to 0$ for $j = 1, \ldots, 9$ as $n \to \infty$ if the rate pair $(R_1, R_2)$ satisfies
(4). It can be concluded that $P_e^n \to 0$ as $n \to \infty$ if the rate pair $(R_1, R_2)$ satisfies (4).
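• Average distortions: We consider two cases in calculating the average distortion between the host sequence $S_i^n$ and the codeword $X_i^n$ for any given message $m_i$ and $q^n \in T_\epsilon^n[Q]$. If $X_i^n(q^n, S_i^n, m_i) \in T_\epsilon^n[X_i | q^n, S_i^n]$ for any $(q^n, S_1^n, S_2^n) \in T_\epsilon^n[Q, S_1, S_2]$, then the distortion between $S_i^n$ and $X_i^n$ is given by

\[
d_i(S_i^n, X_i^n) = \frac{1}{n} \sum_{(x_i, s_i)} N(x_i, s_i | S_i^n, X_i^n)\, d_i(s_i, x_i) \le \Delta_i + \epsilon\, d_{i,\max}, \tag{23}
\]

where $d_{i,\max}$ is the maximum distortion over the set $\mathcal{S}_i \times \mathcal{X}_i$. If $X_i^n(q^n, S_i^n, m_i) \notin T_\epsilon^n[X_i | q^n, S_i^n]$ for any $(q^n, S_1^n, S_2^n) \in T_\epsilon^n[Q, S_1, S_2]$, the distortion $d_i(S_i^n, X_i^n)$ can be upper bounded by $d_{i,\max}$. From error event $E_1$ given $F$, we can show that $\Pr[X_i^n(q^n, S_i^n, m_i) \notin T_\epsilon^n[X_i | q^n, S_i^n]]$ goes to zero as $n \to \infty$. We can then conclude that $\lim_{n \to \infty} \mathbb{E}\, d_i(S_i^n, f_i^n(S_i^n, W_i)) \le \Delta_i$ by letting $\epsilon \to 0$ and $n \to \infty$.

This concludes that $\mathcal{R}^{\mathrm{in}}_{\mathrm{MAC},C}(\Delta_1, \Delta_2) \subseteq \mathcal{C}_{\mathrm{MAC},C}(\Delta_1, \Delta_2)$.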
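The bound in (23) can be checked numerically on a toy example. The following is a minimal sketch, assuming binary alphabets, Hamming distortion, and one common definition of strong typicality (every empirical joint frequency within $\epsilon / (|\mathcal{S}||\mathcal{X}|)$ of the target probability). The function names, the pmf, and the sequences are hypothetical and are not taken from the paper.

```python
from collections import Counter

def empirical_distortion(s_seq, x_seq, d):
    """Per-letter distortion (1/n) * sum over (s, x) of N(s, x) * d[s][x]."""
    n = len(s_seq)
    counts = Counter(zip(s_seq, x_seq))  # joint type N(s, x | s^n, x^n)
    return sum(cnt * d[s][x] for (s, x), cnt in counts.items()) / n

def is_strongly_typical(s_seq, x_seq, p_joint, eps):
    """Assumed definition: |N(s, x)/n - p(s, x)| <= eps / (|S||X|) for every pair."""
    n = len(s_seq)
    counts = Counter(zip(s_seq, x_seq))
    size = len(p_joint) * len(p_joint[0])
    for s, row in enumerate(p_joint):
        for x, p in enumerate(row):
            freq = counts.get((s, x), 0) / n
            if (p == 0 and freq > 0) or abs(freq - p) > eps / size:
                return False
    return True

# Hypothetical example: p(s, x) with E d(S, X) = 0.1 under Hamming distortion.
p_joint = [[0.45, 0.05], [0.05, 0.45]]
d = [[0, 1], [1, 0]]
delta, eps, d_max = 0.1, 0.08, 1

# A length-100 pair of sequences whose joint type is close to p_joint.
pairs = [(0, 0)] * 44 + [(0, 1)] * 6 + [(1, 0)] * 5 + [(1, 1)] * 45
s_seq = [s for s, _ in pairs]
x_seq = [x for _, x in pairs]

typical = is_strongly_typical(s_seq, x_seq, p_joint, eps)
distortion = empirical_distortion(s_seq, x_seq, d)
bound = delta + eps * d_max  # right-hand side of (23)
```

Under this definition, each joint frequency deviates from its target by at most $\epsilon / (|\mathcal{S}||\mathcal{X}|)$, so summing the deviations weighted by $d_{i,\max}$ gives the $\epsilon\, d_{i,\max}$ slack in (23).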
B. Proof of Theorem 2

We prove the following lemmas, which will be used in the proof of Theorem 2.

Lemma 4: Let $(Q_j, S_1, S_2, (X_{1j}, X_{1j}), (X_{2j}, X_{2j}), Y_j) \in \mathcal{P}^*_{\mathrm{MAC},C}(\Delta_{1j}, \Delta_{2j})$, let $\sum_{j=1}^{n} \lambda_j = 1$, $\lambda_j \ge 0$ for $j \in \{1, 2, \ldots, n\}$, and let $\Delta_i = \sum_{j=1}^{n} \lambda_j \Delta_{ij}$ for $i \in \{1, 2\}$. Then, there exists

\[
(Q, S_1, S_2, (X_1, X_1), (X_2, X_2), Y) \in \mathcal{P}^*_{\mathrm{MAC},C}(\Delta_1, \Delta_2)
\]

such that

\[
\sum_{j=1}^{n} \lambda_j\, I(S_1, X_{1j}; Y_j | X_{2j}, S_2, Q_j) = I(S_1, X_1; Y | X_2, S_2, Q), \tag{24a}
\]
\[
\sum_{j=1}^{n} \lambda_j\, I(S_2, X_{2j}; Y_j | S_1, X_{1j}, Q_j) = I(S_2, X_2; Y | X_1, S_1, Q), \tag{24b}
\]
\[
\sum_{j=1}^{n} \lambda_j\, I(S_1, X_{1j}, S_2, X_{2j}; Y_j | Q_j) = I(S_1, X_1, S_2, X_2; Y | Q). \tag{24c}
\]

Proof: If we prove the lemma for $n = 2$, then we can easily extend it to any value of $n$. Let $n = 2$ and let $\lambda_1 + \lambda_2 = 1$, $\lambda_j \ge 0$ for $j = 1, 2$. Let $Z$ be a binary random variable such that $\Pr(Z = j) = \lambda_j$ for $j = 1, 2$. Let

\[
(Q, S_1, S_2, (X_1, X_1), (X_2, X_2), Y) =
\begin{cases}
((Q_1, 1), S_1, S_2, (X_{11}, X_{11}), (X_{21}, X_{21}), Y_1) & \text{if } Z = 1; \\
((Q_2, 2), S_1, S_2, (X_{12}, X_{12}), (X_{22}, X_{22}), Y_2) & \text{if } Z = 2.
\end{cases}
\]

To show that $(Q, S_1, S_2, (X_1, X_1), (X_2, X_2), Y) \in \mathcal{P}^*_{\mathrm{MAC},C}(\Delta_1, \Delta_2)$, we have to check the conditions in Definition 3. We can easily show that $(Q, S_1, S_2, (X_1, X_1), (X_2, X_2), Y)$ satisfies the first condition. To check the second condition, we observe that the Markov chain $X_1 \leftrightarrow (S_1, S_2, Q) \leftrightarrow X_2$ follows as a consequence of

\[
I(X_1; X_2 | S_1, S_2, Q) = \lambda_1 I(X_{11}; X_{21} | S_1, S_2, Q_1) + \lambda_2 I(X_{12}; X_{22} | S_1, S_2, Q_2) = 0.
\]

Similarly, $X_1 \leftrightarrow (S_1, Q) \leftrightarrow S_2$ and $S_1 \leftrightarrow (S_2, Q) \leftrightarrow X_2$. We can easily verify that $\mathbb{E}\, d_i(S_i, X_i) \le \lambda_1 \Delta_{i1} + \lambda_2 \Delta_{i2}$ for $i = 1, 2$ using the distribution on $(Q, S_1, S_2, (X_1, X_1), (X_2, X_2), Y)$. Since this distribution satisfies the conditions in Definition 3, we can conclude that $(Q, S_1, S_2, (X_1, X_1), (X_2, X_2), Y) \in \mathcal{P}^*_{\mathrm{MAC},C}(\Delta_1, \Delta_2)$. Equations (24) follow directly from the same distribution. This completes the proof of Lemma 4.

Lemma 5: Let $(Q_j, S_1, S_2, X_{1j}, X_{2j}, Y_j) \in \mathcal{P}_{\mathrm{MAC},C}(\Delta_{1j}, \Delta_{2j})$, let $\sum_{j=1}^{n} \lambda_j = 1$, $\lambda_j \ge 0$ for $j \in \{1, 2, \ldots, n\}$, and let $\Delta_i = \sum_{j=1}^{n} \lambda_j \Delta_{ij}$ for $i \in \{1, 2\}$. Then, there exists $(Q, S_1, S_2, X_1, X_2, Y) \in \mathcal{P}_{\mathrm{MAC},C}(\Delta_1, \Delta_2)$ such that

\[
\sum_{j=1}^{n} \lambda_j\, I(X_{1j}, S_1; Y_j | X_{2j}, S_2, Q_j) = I(X_1, S_1; Y | X_2, S_2, Q), \tag{25a}
\]
\[
\sum_{j=1}^{n} \lambda_j\, I(X_{2j}, S_2; Y_j | X_{1j}, S_1, Q_j) = I(X_2, S_2; Y | X_1, S_1, Q), \tag{25b}
\]
\[
\sum_{j=1}^{n} \lambda_j\, I(X_{1j}, S_1, X_{2j}, S_2; Y_j | Q_j) = I(X_1, S_1, X_2, S_2; Y | Q). \tag{25c}
\]

Proof: We omit the proof because it is similar to the proof of Lemma 4.
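The time-sharing step behind Lemmas 4 and 5 can be illustrated numerically. The sketch below is a simplification under stated assumptions: the per-index variables $Q_j$ are taken to be trivial, there is a single input $X$ and output $Y$, and the two joint pmfs are hypothetical. It checks that a convex combination of mutual informations equals the conditional mutual information given the time-sharing index $Z$.

```python
import math

def mutual_information(p_xy):
    """I(X;Y) in bits for a joint pmf given as a nested list p_xy[x][y]."""
    px = [sum(row) for row in p_xy]
    py = [sum(col) for col in zip(*p_xy)]
    total = 0.0
    for x, row in enumerate(p_xy):
        for y, p in enumerate(row):
            if p > 0:
                total += p * math.log2(p / (px[x] * py[y]))
    return total

def conditional_mi_given_index(pmfs, lambdas):
    """I(X;Y|Z) computed from the joint p(z, x, y) = lambdas[z] * pmfs[z][x][y]."""
    total = 0.0
    for lam, p_xy in zip(lambdas, pmfs):
        if lam == 0:
            continue
        p_zx = [lam * sum(row) for row in p_xy]        # p(z, x)
        p_zy = [lam * sum(col) for col in zip(*p_xy)]  # p(z, y)
        for x, row in enumerate(p_xy):
            for y, p in enumerate(row):
                p_zxy = lam * p
                if p_zxy > 0:
                    # I(X;Y|Z) = sum p(z,x,y) log[ p(z,x,y) p(z) / (p(z,x) p(z,y)) ]
                    total += p_zxy * math.log2(p_zxy * lam / (p_zx[x] * p_zy[y]))
    return total

# Hypothetical component distributions and time-sharing weights.
p1 = [[0.4, 0.1], [0.1, 0.4]]
p2 = [[0.25, 0.25], [0.25, 0.25]]
lambdas = [0.3, 0.7]

lhs = sum(lam * mutual_information(p) for lam, p in zip(lambdas, [p1, p2]))
rhs = conditional_mi_given_index([p1, p2], lambdas)
```

The two quantities agree up to floating-point error, which is the single-input analogue of (24) and (25) with $Q = Z$.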

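Lemma 6: $\mathcal{R}^{\mathrm{in}}_{\mathrm{MAC},C}(\Delta_1, \Delta_2) \subseteq \mathcal{R}^{\mathrm{in}}_{\mathrm{MAC},C}(\Delta_1', \Delta_2')$ and $\mathcal{R}^{\mathrm{out}}_{\mathrm{MAC},C}(\Delta_1, \Delta_2) \subseteq \mathcal{R}^{\mathrm{out}}_{\mathrm{MAC},C}(\Delta_1', \Delta_2')$ for any $\Delta_1 \le \Delta_1'$ and $\Delta_2 \le \Delta_2'$.

Proof: This lemma follows directly from the fact that $\mathcal{P}^*_{\mathrm{MAC},C}(\Delta_1, \Delta_2) \subseteq \mathcal{P}^*_{\mathrm{MAC},C}(\Delta_1', \Delta_2')$ and $\mathcal{P}_{\mathrm{MAC},C}(\Delta_1, \Delta_2) \subseteq \mathcal{P}_{\mathrm{MAC},C}(\Delta_1', \Delta_2')$.

We are now ready to prove Theorem 2, i.e., to prove that for any sequence of MAC IE codes $(\lceil 2^{nR_1} \rceil, \lceil 2^{nR_2} \rceil, D_1^{(n)}, D_2^{(n)}, n)$ with $\lim_{n \to \infty} P_e^n = 0$ and $\lim_{n \to \infty} D_i^{(n)} \le \Delta_i$ for $i = 1, 2$, the rates must satisfy the constraints stated in Theorem 2.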

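Consider a given code of block length $n$. The joint distribution on $\mathcal{W}_1 \times \mathcal{W}_2 \times \mathcal{S}_1^n \times \mathcal{S}_2^n \times \mathcal{X}_1^n \times \mathcal{X}_2^n \times \mathcal{Y}^n$ is given by

\[
p(w_1, w_2, s_1^n, s_2^n, x_1^n, x_2^n, y^n) = \frac{1}{\lceil 2^{nR_1} \rceil \lceil 2^{nR_2} \rceil} \left( \prod_{j=1}^{n} p(s_{1j}, s_{2j})\, p(y_j | x_{1j}, x_{2j}, s_{1j}, s_{2j}) \right) \prod_{i=1}^{2} p(x_i^n | w_i, s_i^n),
\]

where $p(x_i^n | w_i, s_i^n)$ is 1 if $x_i^n = f_i^n(w_i, s_i^n)$ and 0 otherwise, for $i = 1, 2$. By Fano's inequality [30], the conditional entropy of $(W_1, W_2, S_1^n, S_2^n)$ given $Y^n$ is bounded as

\[
H(W_1, W_2, S_1^n, S_2^n | Y^n) \le n (R_1 + R_2 + \log_2(|\mathcal{S}_1||\mathcal{S}_2|)) P_e^n + 1 = n \epsilon_n, \tag{26}
\]

where $\epsilon_n \to 0$ as $P_e^n \to 0$. We can now bound the rate $R_1$ as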

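\[
\begin{aligned}
n R_1 &\le H(W_1) \overset{(a)}{=} H(W_1 | W_2) \\
&\overset{(a)}{=} H(W_1, S_1^n | W_2, S_2^n) - H(S_1^n | S_2^n) \\
&= H(W_1, S_1^n | W_2, S_2^n) - H(W_1, S_1^n | W_2, S_2^n, Y^n) + H(W_1, S_1^n | W_2, S_2^n, Y^n) - H(S_1^n | S_2^n) \\
&\overset{(b)}{\le} H(W_1, S_1^n | W_2, S_2^n) - H(W_1, S_1^n | W_2, S_2^n, Y^n) - H(S_1^n | S_2^n) + n \epsilon_n \\
&\overset{(c)}{=} H(W_1, S_1^n | W_2, X_2^n, S_2^n) - H(W_1, S_1^n | Y^n, W_2, X_2^n, S_2^n) - H(S_1^n | S_2^n) + n \epsilon_n \\
&= I(W_1, S_1^n; Y^n | W_2, X_2^n, S_2^n) - H(S_1^n | S_2^n) + n \epsilon_n \\
&= H(Y^n | W_2, X_2^n, S_2^n) - H(Y^n | W_2, X_2^n, S_2^n, W_1, S_1^n) - H(S_1^n | S_2^n) + n \epsilon_n \\
&\overset{(d)}{=} H(Y^n | W_2, X_2^n, S_2^n) - H(Y^n | W_2, X_2^n, S_2^n, W_1, S_1^n, X_1^n) - H(S_1^n | S_2^n) + n \epsilon_n \\
&\overset{(e)}{=} \sum_{j=1}^{n} \left[ H(Y_j | W_2, X_2^n, S_2^n, Y^{j-1}) - H(Y_j | W_2, X_2^n, S_2^n, W_1, S_1^n, X_1^n, Y^{j-1}) - H(S_{1j} | S_2^n, S_1^{j-1}) \right] + n \epsilon_n \\
&\overset{(f)}{=} \sum_{j=1}^{n} \left[ H(Y_j | W_2, X_2^n, S_2^n, Y^{j-1}) - H(Y_j | X_{1j}, S_{1j}, X_{2j}, S_{2j}) - H(S_{1j} | S_{2j}) \right] + n \epsilon_n \\
&\overset{(g)}{\le} \sum_{j=1}^{n} \left[ H(Y_j | X_{2j}, S_{2j}) - H(Y_j | X_{1j}, S_{1j}, X_{2j}, S_{2j}) - H(S_{1j} | S_{2j}) \right] + n \epsilon_n \\
&= \sum_{j=1}^{n} \left[ I(X_{1j}, S_{1j}; Y_j | X_{2j}, S_{2j}) - H(S_{1j} | S_{2j}) \right] + n \epsilon_n,
\end{aligned}
\]

where:

(a) follows from the fact that $W_1$ and $W_2$ are independent of each other, and $(W_1, W_2)$ is independent of $(S_1^n, S_2^n)$,

(b) follows from Fano's inequality,

(c) follows from the fact that $X_2^n$ is a function of $(W_2, S_2^n)$,

(d) follows from the fact that $X_1^n$ is a function of $(W_1, S_1^n)$,

(e) follows from the chain rule of mutual information and entropy,

(f) follows from the fact that $Y_j$ depends only on $X_{1j}$, $X_{2j}$, $S_{1j}$, and $S_{2j}$ by the memoryless property of the channel, and $S_{1j} \leftrightarrow S_{2j} \leftrightarrow (S_1^{j-1}, S_2^{j-1}, S_{2,j+1}^n)$,

(g) follows from removing conditioning.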
Hence, we have

\[
R_1 \le \frac{1}{n} \sum_{j=1}^{n} I(X_{1j}, S_{1j}; Y_j | X_{2j}, S_{2j}) - H(S_1 | S_2) + \epsilon_n.
\]

Similarly, we can bound $R_2$ and $R_1 + R_2$ as

\[
R_2 \le \frac{1}{n} \sum_{j=1}^{n} I(X_{2j}, S_{2j}; Y_j | X_{1j}, S_{1j}) - H(S_2 | S_1) + \epsilon_n,
\]
\[
R_1 + R_2 \le \frac{1}{n} \sum_{j=1}^{n} I(X_{1j}, S_{1j}, X_{2j}, S_{2j}; Y_j) - H(S_1, S_2) + \epsilon_n.
\]



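If the host random variables $S_1$ and $S_2$ are correlated, we can clearly see that the random vector $(Q_j, S_1, S_2, X_{1j}, X_{2j}, Y_j)$ with $p(q_j = j) = 1$ belongs to the set $\mathcal{P}_{\mathrm{MAC},C}(\mathbb{E}[d_1(S_{1j}, X_{1j})], \mathbb{E}[d_2(S_{2j}, X_{2j})])$ for $j \in \{1, 2, \ldots, n\}$. According to Lemma 5, there exists a random vector $(Q, S_1, S_2, X_1, X_2, Y) \in \mathcal{P}_{\mathrm{MAC},C}\left(\frac{1}{n} \sum_{j=1}^{n} \mathbb{E}[d_1(S_{1j}, X_{1j})], \frac{1}{n} \sum_{j=1}^{n} \mathbb{E}[d_2(S_{2j}, X_{2j})]\right)$ such that the following is true:

\[
\begin{aligned}
\frac{1}{n} \sum_{j=1}^{n} I(X_{1j}, S_{1j}; Y_j | X_{2j}, S_{2j}) &= I(X_1, S_1; Y | X_2, S_2, Q), \\
\frac{1}{n} \sum_{j=1}^{n} I(X_{2j}, S_{2j}; Y_j | X_{1j}, S_{1j}) &= I(X_2, S_2; Y | X_1, S_1, Q), \\
\frac{1}{n} \sum_{j=1}^{n} I(X_{1j}, S_{1j}, X_{2j}, S_{2j}; Y_j) &= I(X_1, S_1, X_2, S_2; Y | Q).
\end{aligned}
\tag{28}
\]

As $n \to \infty$, we can conclude the following:

\[
\begin{aligned}
\mathcal{C}_{\mathrm{MAC},C}(\Delta_1, \Delta_2) &\subseteq \mathcal{R}^{\mathrm{out}}_{\mathrm{MAC},C}\left( \lim_{n \to \infty} \frac{1}{n} \sum_{j=1}^{n} \mathbb{E}[d_1(S_{1j}, X_{1j})],\ \lim_{n \to \infty} \frac{1}{n} \sum_{j=1}^{n} \mathbb{E}[d_2(S_{2j}, X_{2j})] \right) \\
&\overset{(a)}{\subseteq} \mathcal{R}^{\mathrm{out}}_{\mathrm{MAC},C}(\Delta_1, \Delta_2),
\end{aligned}
\tag{29}
\]

where (a) follows from Lemma 6.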

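If the host random variables $S_1$ and $S_2$ are independent, we can obtain the following from the condition that the messages $W_1$ and $W_2$ are independent:

\[
p(x_{1j}, x_{2j} | s_{1j}, s_{2j}) = p(x_{1j} | s_{1j})\, p(x_{2j} | s_{2j}).
\]

Then, we can clearly see that the random variable tuple $(Q_j, S_1, S_2, (X_{1j}, X_{1j}), (X_{2j}, X_{2j}), Y_j)$ with $p(q_j = j) = 1$ belongs to the set $\mathcal{P}^*_{\mathrm{MAC},C}(\mathbb{E}[d_1(S_{1j}, X_{1j})], \mathbb{E}[d_2(S_{2j}, X_{2j})])$ for $j \in \{1, 2, \ldots, n\}$. According to Lemma 4, there exists a random vector

\[
(Q, S_1, S_2, (X_1, X_1), (X_2, X_2), Y) \in \mathcal{P}^*_{\mathrm{MAC},C}\left( \frac{1}{n} \sum_{j=1}^{n} \mathbb{E}[d_1(S_{1j}, X_{1j})],\ \frac{1}{n} \sum_{j=1}^{n} \mathbb{E}[d_2(S_{2j}, X_{2j})] \right)
\]

such that (28) is true. As $n \to \infty$, we can conclude the following:

\[
\begin{aligned}
\mathcal{C}_{\mathrm{MAC},C}(\Delta_1, \Delta_2) &\subseteq \mathcal{R}^{\mathrm{in}}_{\mathrm{MAC},C}\left( \lim_{n \to \infty} \frac{1}{n} \sum_{j=1}^{n} \mathbb{E}[d_1(S_{1j}, X_{1j})],\ \lim_{n \to \infty} \frac{1}{n} \sum_{j=1}^{n} \mathbb{E}[d_2(S_{2j}, X_{2j})] \right) \\
&\overset{(a)}{\subseteq} \mathcal{R}^{\mathrm{in}}_{\mathrm{MAC},C}(\Delta_1, \Delta_2),
\end{aligned}
\tag{30}
\]

where (a) follows from Lemma 6. This completes the proof of Theorem 2.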

C. Proof of Theorem 3

In this section, we show that $\mathcal{R}^{\mathrm{in}}_{B'}(\Delta) \subseteq \mathcal{C}_{B'}(\Delta)$. Fix the random vector $(U, S, X, Y, Z) \in \mathcal{P}(\Delta)$. For each $n$, we construct a $(\lceil 2^{nR_1} \rceil, \lceil 2^{nR_2} \rceil, n)$ BC IE code as follows.

• Code construction: Generate $\lceil 2^{nR_2} \rceil\, 2^{n(I(U;S) + \epsilon)}$ sequences $U^n$ drawn according to $\prod_{j=1}^{n} p(u_j)$. Distribute these sequences randomly into $\lceil 2^{nR_2} \rceil$ bins such that each bin has $2^{n(I(U;S) + \epsilon)}$ sequences. Label all sequences $U^n$ in bin $m_2 \in \{1, 2, \ldots, \lceil 2^{nR_2} \rceil\}$ as $U^n(m_2)$. For each $(S^n, U^n) \in T_\epsilon^n[S, U]$, generate $\lceil 2^{nR_1} \rceil$ sequences $X^n$ according to $\prod_{j=1}^{n} p(x_j | s_j, u_j)$. Label these sequences as $X^n(S^n, U^n, m_1)$, where $(S^n, U^n) \in T_\epsilon^n[S, U]$ and $m_1 \in \{1, 2, \ldots, \lceil 2^{nR_1} \rceil\}$. These codebooks are revealed to the encoder and both decoders.

• Encoder: The encoder, upon observing $S^n \in T_\epsilon^n[S]$ at the output of the host source, embeds message $W_2 \in \{1, 2, \ldots, \lceil 2^{nR_2} \rceil\}$ into the host sequence by looking for a $U^n$ in bin $W_2$ such that $U^n(W_2) \in T_\epsilon^n[S, U | S^n]$. If such a sequence $U^n(W_2)$ does not exist, the encoder declares an error; otherwise, the encoder embeds message $W_1 \in \{1, 2, \ldots, \lceil 2^{nR_1} \rceil\}$ into the host sequence $S^n$ by choosing the codeword $X^n(S^n, U^n(W_2), W_1)$.

• Decoder 1: Decoder 1, upon receiving $Y^n$, which is a distorted or attacked version of the embedded sequence $X^n$, looks for $U^n(m_2)$, $m_2 \in \{1, 2, \ldots, \lceil 2^{nR_2} \rceil\}$, such that $(U^n(m_2), Y^n) \in T_\epsilon^n[U, Y]$. If a unique codeword $U^n(m_2)$ does not exist, Decoder 1 declares an error; otherwise, Decoder 1 declares that $\hat{W}_2 = m_2$. Upon decoding the sequence $U^n(\hat{W}_2)$, Decoder 1 looks for $X^n(s^n, U^n(\hat{W}_2), m_1)$ such that $(X^n(s^n, U^n(\hat{W}_2), m_1), Y^n) \in T_\epsilon^n[S, U, X, Y | s^n, U^n(\hat{W}_2)]$, searching over each $s^n \in T_\epsilon^n[U, S | U^n(\hat{W}_2)]$ and $m_1 \in \{1, 2, \ldots, \lceil 2^{nR_1} \rceil\}$. If a unique codeword $X^n(s^n, U^n(\hat{W}_2), m_1)$ exists, Decoder 1 declares that $(\hat{W}_1, \hat{S}^n) = (m_1, s^n)$; otherwise, it declares an error.

• Decoder 2: Decoder 2, upon receiving $Z^n$, which is a degraded version of $Y^n$, looks for $U^n(m_2)$, $m_2 \in \{1, 2, \ldots, \lceil 2^{nR_2} \rceil\}$, such that $(U^n(m_2), Z^n) \in T_\epsilon^n[U, Z]$. If a unique codeword $U^n(m_2)$ exists, Decoder 2 declares that $\hat{W}_2 = m_2$; otherwise, Decoder 2 declares an error.
• Probability of error: The average probability of error is given by

\[
\begin{aligned}
P_e^n &= \sum_{s^n \in \mathcal{S}^n} p(s^n) \Pr[\text{error} | s^n] \\
&\le \sum_{s^n \notin T_\epsilon^n[S]} p(s^n) + \sum_{s^n \in T_\epsilon^n[S]} p(s^n) \Pr[\text{error} | s^n],
\end{aligned}
\tag{31}
\]

where the first term, $\Pr[s^n \notin T_\epsilon^n[S]]$, goes to zero as $n \to \infty$ by the strong asymptotic equipartition property (AEP). Without loss of generality, it can be assumed that the output of the host source is $\tilde{s}^n$, and the message pair $(W_1, W_2) = (1, 1)$ is to be embedded into the host sequence $\tilde{s}^n$. Let $F$ be the event that the host source output is $\tilde{s}^n$. To compute $\Pr[\text{error} | F]$, let us write the error event as $E_0 \cup E_1 \cup E_2 \cup E_3$, where:

1) $E_0$ is the event that there is no $U^n(1)$ such that $U^n(1) \in T_\epsilon^n[U, S | \tilde{s}^n]$. Using well-known rate-distortion arguments, the probability of this event approaches zero as $n$ goes to infinity since each bin has $2^{n(I(U;S) + \epsilon)}$ sequences $U^n$.

Conditioned on the event $F \cap E_0^c$, it can also be assumed that $U^n(1)$ is jointly strongly typical with the host sequence $\tilde{s}^n$. Hence, the embedded sequence $X^n(\tilde{s}^n, U^n(1), 1)$ is generated and transmitted from the encoder.

2) $E_1$ is the event that

\[
(U^n(1), X^n(\tilde{s}^n, U^n(1), 1), Y^n, Z^n) \notin T_\epsilon^n[S, U, X, Y, Z | \tilde{s}^n].
\]

By the strong AEP, we can show that $\Pr[E_1 | F \cap E_0^c] \to 0$ as $n \to \infty$.

3) $E_2 := E_{2,1} \cup (E_{2,1}^c \cap E_{2,2})$, where $E_{2,1}$ is the event that $(U^n, Y^n) \in T_\epsilon^n[U, Y]$ for $U^n \ne U^n(1)$, and $E_{2,2}$ is the event that $(X^n(s^n, U^n(1), m_1), Y^n) \in T_\epsilon^n[S, U, X, Y | s^n, U^n(1)]$ for $m_1 \ne 1$ or $s^n \in \{s^n : s^n \ne \tilde{s}^n, s^n \in T_\epsilon^n[U, S | U^n(1)]\}$. It can be shown that $\Pr[E_{2,1} | F \cap E_0^c] \to 0$ as $n \to \infty$ if $R_2 < I(U; Y) - I(U; S)$ and that $\Pr(E_{2,2} | F \cap E_0^c \cap E_{2,1}^c) \to 0$ as $n \to \infty$ if $R_1 < I(S, X; Y | U) - H(S | U)$.

4) $E_3$ is the event that $(U^n, Z^n) \in T_\epsilon^n[U, Z]$ for $U^n \ne U^n(1)$. Using Gel'fand-Pinsker arguments, it can be shown that $\Pr[E_3 | F \cap E_0^c] \to 0$ as $n \to \infty$ if $R_2 < I(U; Z) - I(U; S)$. Because the broadcast channel is degraded, this constraint on $R_2$ is more restrictive than the previous constraint.

Thus, by the union bound, it can be shown that $P_e^n$ goes to zero as $n \to \infty$ if $(R_1, R_2) \in \mathcal{R}^{\mathrm{in}}_{B'}(\Delta)$.

• Average distortion: Since $(X^n, s^n)$ is jointly strongly typical with high probability and the distribution belongs to $\mathcal{P}(\Delta)$, it can be shown that the average distortion $D^{(n)}$ associated with the generated code satisfies the distortion constraint $\Delta$ as $n \to \infty$, as in the proof of Theorem 1.
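The binning step used by the encoder in this proof can be sketched in a few lines. This is a toy illustration under stated assumptions, not the construction of the paper: alphabets are binary, the bin size is a free parameter instead of $2^{n(I(U;S)+\epsilon)}$, the typicality test uses one common definition of strong typicality, and all names (`build_bins`, `encode_w2`, `jointly_typical`) and numbers are hypothetical.

```python
import random
from collections import Counter

def jointly_typical(u_seq, s_seq, p_us, eps):
    """Assumed test: |N(u, s)/n - p(u, s)| <= eps / (|U||S|) for every pair."""
    n = len(u_seq)
    counts = Counter(zip(u_seq, s_seq))
    size = len(p_us) * len(p_us[0])
    for u, row in enumerate(p_us):
        for s, p in enumerate(row):
            if abs(counts.get((u, s), 0) / n - p) > eps / size:
                return False
    return True

def build_bins(num_bins, bin_size, n, p_u, rng):
    """Draw num_bins * bin_size i.i.d. sequences from p(u) and split them into bins."""
    symbols = list(range(len(p_u)))
    return [[rng.choices(symbols, weights=p_u, k=n) for _ in range(bin_size)]
            for _ in range(num_bins)]

def encode_w2(bins, w2, s_seq, p_us, eps):
    """Return the first sequence in bin w2 that is jointly typical with s_seq."""
    for u_seq in bins[w2]:
        if jointly_typical(u_seq, s_seq, p_us, eps):
            return u_seq
    return None  # the encoder declares an error (event E_0)

# Hypothetical parameters for a short block length.
rng = random.Random(0)
p_us = [[0.4, 0.1], [0.1, 0.4]]   # assumed joint pmf p(u, s)
p_u = [0.5, 0.5]                  # its U-marginal
n, eps = 20, 0.4
bins = build_bins(num_bins=4, bin_size=64, n=n, p_u=p_u, rng=rng)
s_seq = [0] * 10 + [1] * 10       # a host sequence with the typical type
u_star = encode_w2(bins, 2, s_seq, p_us, eps)
```

Making each bin large enough that some sequence in it is jointly typical with the host is what the $I(U;S)$ term in the bin size buys, and it is the reason that same term is subtracted in the constraints on $R_2$.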

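D. Proof of Theorem 4

In this section, we show that $\mathcal{C}_{B'}(\Delta) \subseteq \mathcal{R}^{\mathrm{out}}_{B'}(\Delta)$. Suppose we are given a sequence of $(\lceil 2^{nR_1} \rceil, \lceil 2^{nR_2} \rceil, D^{(n)}, n)$ BC IE codes, i.e., $X^n = f^n(W_1, W_2, S^n)$, $g^n_{1,B'}(Y^n) = (\hat{W}_1, \hat{W}_2, \hat{S}^n)$, and $g^n_{2,B'}(Z^n) = \hat{W}_2$, with $\lim_{n \to \infty} P_e^n = 0$ and $\lim_{n \to \infty} D^{(n)} \le \Delta$. We show that the rate pair $(R_1, R_2)$ must then satisfy (11) for some $((U, V), S, X, Y, Z) \in \mathcal{P}(\Delta)$. Consider a given code of block length $n$. The joint distribution on $\mathcal{W}_1 \times \mathcal{W}_2 \times \mathcal{S}^n \times \mathcal{X}^n \times \mathcal{Y}^n \times \mathcal{Z}^n$ induced by the code is given by

\[
p(w_1, w_2, s^n, x^n, y^n, z^n) = \frac{1}{\lceil 2^{nR_1} \rceil \lceil 2^{nR_2} \rceil}\, p(s^n)\, p(x^n | w_1, w_2, s^n) \prod_{j=1}^{n} p(y_j | x_j, s_j)\, p(z_j | y_j),
\]

where $p(x^n | w_1, w_2, s^n)$ is 1 if $x^n = f^n(w_1, w_2, s^n)$ and 0 otherwise. We can bound the rate $R_1$ as follows: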

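\[
\begin{aligned}
n R_1 &\le H(W_1) \\
&\overset{(a)}{=} H(W_1, S^n | W_2) - H(S^n | W_2) \\
&= H(W_1, S^n | W_2) - H(W_1, S^n | W_2, Y^n) + H(W_1, S^n | W_2, Y^n) - H(S^n | W_2) \\
&\overset{(b)}{\le} I(W_1, S^n; Y^n | W_2) - H(S^n | W_2) + n \epsilon_n \\
&\overset{(c)}{=} \sum_{j=1}^{n} \left[ I(W_1, S^n; Y_j | W_2, Y^{j-1}) - H(S_j | W_2) \right] + n \epsilon_n \\
&\overset{(d)}{=} \sum_{j=1}^{n} \left[ H(Y_j | W_2, Y^{j-1}) - H(Y_j | W_2, W_1, S^n, X^n, Y^{j-1}) - H(S_j | W_2) \right] + n \epsilon_n \\
&\overset{(e)}{=} \sum_{j=1}^{n} \left[ H(Y_j | W_2, Y^{j-1}, Z^{j-1}) - H(Y_j | S_j, X_j) - H(S_j | W_2) \right] + n \epsilon_n \\
&\overset{(f)}{\le} \sum_{j=1}^{n} \left[ H(Y_j | W_2, Z^{j-1}) - H(Y_j | S_j, X_j, W_2, Z^{j-1}) - H(S_j | W_2, Z^{j-1}) \right] + n \epsilon_n \\
&= \sum_{j=1}^{n} \left[ I(S_j, X_j; Y_j | W_2, Z^{j-1}) - H(S_j | W_2, Z^{j-1}) \right] + n \epsilon_n,
\end{aligned}
\tag{32}
\]

where $\epsilon_n \to 0$ as $n \to \infty$, and

(a) follows from the fact that $W_1$, $W_2$ and $S^n$ are mutually independent,

(b) follows from Fano's inequality,

(c) follows from the chain rule and the fact that $S^n$ is i.i.d. and independent of $W_2$,

(d) follows from the fact that $X^n$ is a deterministic function of $(W_1, W_2, S^n)$,

(e) follows from the degraded and memoryless properties of the broadcast channel, and

(f) follows from removing conditioning in the positive term and introducing conditioning in the negative terms.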

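We can also bound the rate $R_2$ as follows:

\[
\begin{aligned}
n R_2 &\le H(W_2) \\
&\overset{(a)}{\le} I(W_2; Z^n) + n \epsilon_n \\
&= \sum_{j=1}^{n} \left[ I(W_2, S_{j+1}^n; Z^j) - I(W_2, S_j^n; Z^{j-1}) \right] + n \epsilon_n \\
&\overset{(b)}{=} \sum_{j=1}^{n} \left[ I(W_2, S_{j+1}^n; Z^{j-1}) + I(W_2, S_{j+1}^n; Z_j | Z^{j-1}) - I(W_2, S_{j+1}^n; Z^{j-1}) - I(S_j; Z^{j-1} | W_2, S_{j+1}^n) \right] + n \epsilon_n \\
&= \sum_{j=1}^{n} \left[ I(W_2, S_{j+1}^n; Z_j | Z^{j-1}) - I(S_j; Z^{j-1} | W_2, S_{j+1}^n) \right] + n \epsilon_n \\
&= \sum_{j=1}^{n} \left[ H(Z_j | Z^{j-1}) - H(Z_j | W_2, Z^{j-1}, S_{j+1}^n) - H(S_j | W_2, S_{j+1}^n) + H(S_j | W_2, Z^{j-1}, S_{j+1}^n) \right] + n \epsilon_n \\
&\overset{(c)}{\le} \sum_{j=1}^{n} \left[ H(Z_j) - H(Z_j | W_2, Z^{j-1}, S_{j+1}^n) - H(S_j) + H(S_j | W_2, Z^{j-1}, S_{j+1}^n) \right] + n \epsilon_n \\
&= \sum_{j=1}^{n} \left[ I(W_2, Z^{j-1}, S_{j+1}^n; Z_j) - I(W_2, Z^{j-1}, S_{j+1}^n; S_j) \right] + n \epsilon_n,
\end{aligned}
\tag{33}
\]

where $\epsilon_n \to 0$ as $n \to \infty$, and

(a) follows from Fano's inequality,

(b) follows from applying the chain rule on $(Z^{j-1}, Z_j)$ and $(S_{j+1}^n, S_j)$ in the first and second mutual information expressions, respectively, and

(c) follows from removing conditioning and the fact that $S^n$ is i.i.d. and independent of $W_2$.

The sum in the third line telescopes: the $j = n$ term contributes $I(W_2; Z^n)$ and the $j = 1$ term subtracts $I(W_2, S_1^n; Z^0) = 0$.

Let $U_j := \{W_2, Z^{j-1}\}$ and $V_j := \{S_{j+1}^n\}$ for $j = 1, 2, \ldots, n$. We can then write (32) and (33) as

\[
R_1 \le I(S, X; Y | Q, U) - H(S | Q, U) + \epsilon_n, \tag{34a}
\]
\[
R_2 \le I(U, V; Z | Q) - I(U, V; S | Q) + \epsilon_n, \tag{34b}
\]

where $Q$ takes values in the set $\{1, 2, \ldots, n\}$ with equal probability and the joint probability distribution on $(S, Q, U, V, X, Y, Z)$ is $p(S = s, Q = q, U = u, V = v, X = x)\, p(y | x, s)\, p(z | y)$, with

\[
p(S = s, Q = q, U = u, V = v, X = x) = p(s)\, p(q)\, p(U_q = u, V_q = v | s, q)\, p(X_q = x | s, q, u, v).
\]

Finally, we can write (34) as

\[
R_1 \le I(S, X; Y | U) - H(S | U) + \epsilon_n,
\]
\[
R_2 \le I(U, V; Z) - I(U, V; S) + \epsilon_n,
\]

where $U := (Q, U)$, since $I(U, V; Z | Q) \le I(Q, U, V; Z)$ and $I(Q; S) = 0$.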
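Given any $\delta > 0$, the associated distortion $D^{(n)}$, for sufficiently large $n$, satisfies

\[
\begin{aligned}
\Delta + \delta \ge D^{(n)} &= \mathbb{E}\, d(X^n, S^n) \\
&= \frac{1}{n} \sum_{j=1}^{n} \sum_{x, s} p(X_j = x, S_j = s)\, d(x, s) \\
&= \sum_{x, s} p(X = x, S = s)\, d(x, s) \\
&= \mathbb{E}\, d(X, S).
\end{aligned}
\]

As $n \to \infty$ and $\delta \to 0$, $((U, V), S, X, Y, Z) \in \mathcal{P}(\Delta)$ and $(R_1, R_2) \in \mathcal{R}^{\mathrm{out}}_{B'}(\Delta)$. Thus, $\mathcal{C}_{B'}(\Delta) \subseteq \mathcal{R}^{\mathrm{out}}_{B'}(\Delta)$.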
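The single-letterization of the distortion used at the end of this converse can be checked on a toy example. The sketch below assumes binary alphabets, Hamming distortion, and three hypothetical per-letter joint pmfs $p(X_j = x, S_j = s)$ sharing the same $S$-marginal; it verifies that the block-average distortion equals $\mathbb{E}\, d(X, S)$ under the time-shared pmf obtained with $Q$ uniform on $\{1, \ldots, n\}$.

```python
def average_distortion(per_letter_pmfs, d):
    """(1/n) * sum_j sum_{x,s} p(X_j = x, S_j = s) * d[x][s]."""
    n = len(per_letter_pmfs)
    total = 0.0
    for p in per_letter_pmfs:
        for x, row in enumerate(p):
            for s, prob in enumerate(row):
                total += prob * d[x][s]
    return total / n

def time_shared_pmf(per_letter_pmfs):
    """p(X = x, S = s) = (1/n) * sum_q p(X_q = x, S_q = s), Q uniform on the indices."""
    n = len(per_letter_pmfs)
    rows, cols = len(per_letter_pmfs[0]), len(per_letter_pmfs[0][0])
    return [[sum(p[x][s] for p in per_letter_pmfs) / n for s in range(cols)]
            for x in range(rows)]

def expected_distortion(p_xs, d):
    """E d(X, S) under a single-letter joint pmf."""
    return sum(prob * d[x][s] for x, row in enumerate(p_xs) for s, prob in enumerate(row))

# Hypothetical per-letter pmfs (rows: x, columns: s), all with a uniform S-marginal.
pmfs = [
    [[0.4, 0.1], [0.1, 0.4]],
    [[0.5, 0.0], [0.0, 0.5]],
    [[0.3, 0.2], [0.2, 0.3]],
]
hamming = [[0, 1], [1, 0]]

block_average = average_distortion(pmfs, hamming)
single_letter = expected_distortion(time_shared_pmf(pmfs), hamming)
```

The equality is just linearity of expectation, which is why the distortion constraint carries over to the single-letter variables without loss.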

E. Proof of Theorem 5

1) Achievability: In this section, we show that $\mathcal{R}_{C'}(\Delta) \subseteq \mathcal{C}_{C'}(\Delta)$. Fix the random vector $(U, S, X, Y, Z) \in \mathcal{P}(\Delta)$. For each $n$, we construct a $(\lceil 2^{nR_1} \rceil, \lceil 2^{nR_2} \rceil, n)$ BC IE code as follows.

• Code construction: At the encoder, for each $s^n \in \mathcal{S}^n$, generate $2^{nR_2}$ sequences drawn according to $\prod_{j=1}^{n} p(u_j | s_j)$. Denote these sequences as $U^n(s^n, m_2)$, where $m_2 \in \{1, 2, \ldots, 2^{nR_2}\}$. For each pair $(s^n, U^n)$, generate $2^{nR_1}$ sequences $X^n$ drawn according to $\prod_{j=1}^{n} p(x_j | s_j, u_j)$. Call these sequences $X^n(s^n, m_1, m_2)$, where $m_1 \in \{1, 2, \ldots, 2^{nR_1}\}$. In this way, the codebook is generated at the encoder and revealed to both decoders.

• Encoding: The encoder, upon observing $s^n$ at the output of the host source, sends messages $W_1 \in \{1, 2, \ldots, 2^{nR_1}\}$ and $W_2 \in \{1, 2, \ldots, 2^{nR_2}\}$ by transmitting the codeword $X^n(s^n, W_1, W_2)$. In this way, the codeword $X^n$ is chosen and transmitted from the encoder for a given host sequence $S^n$ and a given message pair $(W_1, W_2)$.

• Decoder 1: Decoder 1, upon receiving the channel output $Y^n$, looks for $U^n(s^n, m_2)$ such that $(U^n(s^n, m_2), Y^n) \in T_\epsilon^n[S, U, Y | s^n]$, searching over all $s^n \in T_\epsilon^n[S]$. If a unique codeword $U^n(s^n, m_2)$ exists, Decoder 1 then looks for $X^n(s^n, m_1, m_2)$ such that $(X^n(s^n, m_1, m_2), Y^n) \in T_\epsilon^n[S, U, X, Y | s^n, U^n(s^n, m_2)]$. If a unique codeword $X^n(s^n, m_1, m_2)$ exists, Decoder 1 declares that $(\hat{W}_1, \hat{S}^n) = (m_1, s^n)$. Otherwise, Decoder 1 declares an error. In this way, the message intended for Decoder 1 and the host sequence are decoded at Decoder 1.

• Decoder 2: Decoder 2, upon receiving the channel output $Z^n$, looks for $U^n(s^n, m_2)$ such that $(U^n(s^n, m_2), Z^n) \in T_\epsilon^n[S, U, Z | s^n]$, searching over all $s^n \in T_\epsilon^n[S]$. If a unique codeword $U^n(s^n, m_2)$ exists, Decoder 2 declares that $(\hat{W}_2, \hat{S}^n) = (m_2, s^n)$. Otherwise, Decoder 2 declares an error. In this way, the message intended for Decoder 2 and the host sequence are decoded at Decoder 2.

• Probability of error: The average probability of error is given by

\[
\begin{aligned}
P_e^n &= \sum_{s^n \in \mathcal{S}^n} p(s^n) \Pr[\text{error} | s^n] \\
&\le \sum_{s^n \notin T_\epsilon^n[S]} p(s^n) + \sum_{s^n \in T_\epsilon^n[S]} p(s^n) \Pr[\text{error} | s^n] \\
&= \sum_{s^n \notin T_\epsilon^n[S]} p(s^n) + \sum_{s^n \in T_\epsilon^n[S]} p(s^n) \Pr[E(1) \cup E(2) | s^n],
\end{aligned}
\tag{35}
\]

where $E(i)$ is the event that an error is made at Decoder $i$, for $i = 1, 2$. The first term, $\Pr[s^n \notin T_\epsilon^n[S]]$, on the right-hand side of (35) goes to zero as $n \to \infty$ by Lemma 2.

Without loss of generality, it can be assumed that the output of the host source is $\tilde{s}^n$, and $(W_1, W_2) = (1, 1)$ is being transmitted from the encoder. Hence, the codeword $X^n(\tilde{s}^n, 1, 1)$ is transmitted from the encoder. Let $F$ be the event that $\tilde{s}^n \in T_\epsilon^n[S]$ is the output of the host source.

The following error events are considered to compute $\Pr[E(2) | F]$; the probability of each can be made to approach zero as $n \to \infty$.

1) $E_1$: $(U^n(\tilde{s}^n, 1), X^n(\tilde{s}^n, 1, 1), Y^n, Z^n) \notin T_\epsilon^n[S, U, X, Y, Z | \tilde{s}^n]$ under the event $F$. By using Lemma 2, we can show that $\Pr[E_1 | F] \to 0$ as $n \to \infty$.

2) $E_2$: $(U^n(\tilde{s}^n, m_2), Z^n) \in T_\epsilon^n[S, U, Z | \tilde{s}^n]$ under the event $F \cap E_1^c$ for all $m_2 \ne 1$. It can be shown that $\Pr(E_2 | F) \to 0$ as $n \to \infty$ by using Lemma 2 and Lemma 3 if $0 < R_2 < I(U; Z | S)$.

3) $E_3$: $(U^n(s^n, m_2), Z^n) \in T_\epsilon^n[S, U, Z | s^n]$ under the event $F \cap E_1^c$ for all $m_2$ and $s^n \ne \tilde{s}^n$. It can be shown that $\Pr(E_3 | F) \to 0$ as $n \to \infty$ by using Lemma 2 and Lemma 3 if $0 < R_2 < I(U, S; Z) - H(S)$.

From all of the above error events, it can be concluded that $\Pr[E(2) | F] \to 0$ as $n \to \infty$ if $0 < R_2 < I(U, S; Z) - H(S)$. The following error events are considered to compute $\Pr[E(1) | F]$; the probability of each can be made to approach zero as $n \to \infty$.

1) $E_4$: $(U^n(s^n, m_2), Y^n) \in T_\epsilon^n[S, U, Y | s^n]$ for $m_2 \ne 1$ or $s^n \ne \tilde{s}^n$. By considering error events similar to $E_2$ and $E_3$, it can be shown that $\Pr(E_4 | F, E_1^c) \to 0$ as $n \to \infty$ if $0 < R_2 < I(U, S; Y) - H(S)$.

2) $E_5$: $(X^n(\tilde{s}^n, m_1, 1), Y^n) \in T_\epsilon^n[S, U, X, Y | \tilde{s}^n, U^n(\tilde{s}^n, 1)]$ for $m_1 \ne 1$. It can be shown that $\Pr(E_5 | F, E_1^c, E_4^c) \to 0$ as $n \to \infty$ if $0 < R_1 < I(X; Y | S, U)$.

Then, by the union bound, $\Pr[E(1) \cup E(2) | F]$ goes to zero as $n \to \infty$ if the rate pair $(R_1, R_2)$ satisfies (12). It can be concluded that $P_e^n \to 0$ as $n \to \infty$ if the rate pair $(R_1, R_2)$ satisfies (12).

• Average distortions: Since $(X^n, s^n)$ is jointly strongly typical with high probability and the distribution belongs to $\mathcal{P}(\Delta)$, it can be shown that the average distortion $D^{(n)}$ associated with the generated code satisfies the distortion constraint $\Delta$ as $n \to \infty$, as in the proof of Theorem 1.
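In the error analysis for Decoder 2, the condition from $E_3$ is the binding one because $I(U, S; Z) - H(S) = I(U; Z | S) - H(S | Z) \le I(U; Z | S)$. The sketch below checks this identity numerically on a hypothetical joint pmf for $(U, S, Z)$; the source and channel parameters are illustrative assumptions, not values from the paper.

```python
import math
from itertools import product

def entropy(pmf):
    """Shannon entropy in bits of a pmf stored as a dict of probabilities."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

def marginal(joint, keep):
    """Marginalize a joint pmf (dict keyed by tuples) onto the coordinates in keep."""
    out = {}
    for key, p in joint.items():
        sub = tuple(key[i] for i in keep)
        out[sub] = out.get(sub, 0.0) + p
    return out

# Hypothetical joint pmf on (u, s, z):
# S ~ Bern(1/2), U = S xor Bern(0.2), Z = U xor Bern(0.1).
joint = {}
for s, b1, b2 in product((0, 1), repeat=3):
    u = s ^ b1
    z = u ^ b2
    p = 0.5 * (0.2 if b1 else 0.8) * (0.1 if b2 else 0.9)
    joint[(u, s, z)] = joint.get((u, s, z), 0.0) + p

def h(keep):
    """Entropy of the coordinates in keep (0 = U, 1 = S, 2 = Z)."""
    return entropy(marginal(joint, keep))

i_us_z = h((0, 1)) + h((2,)) - h((0, 1, 2))                      # I(U,S;Z)
i_u_z_given_s = h((0, 1)) + h((1, 2)) - h((1,)) - h((0, 1, 2))   # I(U;Z|S)
h_s_given_z = h((1, 2)) - h((2,))                                # H(S|Z)

lhs = i_us_z - h((1,))              # I(U,S;Z) - H(S)
rhs = i_u_z_given_s - h_s_given_z   # I(U;Z|S) - H(S|Z)
```

Because $H(S | Z) \ge 0$, any $R_2$ below $I(U, S; Z) - H(S)$ is automatically below $I(U; Z | S)$, which is why the conclusion after $E_3$ keeps only the second condition.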

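2) Converse: We show that for any sequence of $(\lceil 2^{nR_1} \rceil, \lceil 2^{nR_2} \rceil, D^{(n)}, n)$ codes, i.e., $X^n = f^n(W_1, W_2, S^n)$, $g^n_{1,C'}(Y^n) = (\hat{W}_1, \hat{W}_2, \hat{S}^n)$, and $g^n_{2,C'}(Z^n) = (\hat{W}_2, \hat{S}^n)$, with $\lim_{n \to \infty} P_e^n = 0$ and $\lim_{n \to \infty} D^{(n)} \le \Delta$, the rate pair $(R_1, R_2)$ must satisfy (12) for some $(U, S, X, Y, Z) \in \mathcal{P}(\Delta)$. Consider a given code of block length $n$. The joint distribution on $\mathcal{W}_1 \times \mathcal{W}_2 \times \mathcal{S}^n \times \mathcal{X}^n \times \mathcal{Y}^n \times \mathcal{Z}^n$ induced by the code is given by

\[
p(w_1, w_2, s^n, x^n, y^n, z^n) = \frac{1}{\lceil 2^{nR_1} \rceil \lceil 2^{nR_2} \rceil}\, p(s^n)\, p(x^n | w_1, w_2, s^n) \prod_{j=1}^{n} p(y_j | x_j, s_j)\, p(z_j | y_j),
\]

where $p(x^n | w_1, w_2, s^n)$ is 1 if $x^n = f^n(w_1, w_2, s^n)$ and 0 otherwise.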
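We can bound the rate $R_1$ as follows:

\[
\begin{aligned}
n R_1 &\le H(W_1) \\
&\overset{(a)}{=} H(W_1 | W_2, S^n) \\
&= H(W_1 | W_2, S^n) - H(W_1 | W_2, S^n, Y^n) + H(W_1 | W_2, S^n, Y^n) \\
&\overset{(b)}{\le} I(W_1; Y^n | W_2, S^n) + n \epsilon_n \\
&= \sum_{j=1}^{n} I(W_1; Y_j | W_2, S^n, Y^{j-1}) + n \epsilon_n \\
&= \sum_{j=1}^{n} \left[ H(Y_j | W_2, S^n, Y^{j-1}) - H(Y_j | W_1, W_2, S^n, Y^{j-1}) \right] + n \epsilon_n \\
&\overset{(c)}{=} \sum_{j=1}^{n} \left[ H(Y_j | W_2, S^n, Y^{j-1}, Z^{j-1}) - H(Y_j | W_1, W_2, S^n, Y^{j-1}, Z^{j-1}) \right] + n \epsilon_n \\
&\overset{(d)}{\le} \sum_{j=1}^{n} \left[ H(Y_j | W_2, S^n, Z^{j-1}) - H(Y_j | W_1, W_2, S^n, Z^{j-1}, X^n) \right] + n \epsilon_n \\
&\overset{(e)}{=} \sum_{j=1}^{n} \left[ H(Y_j | W_2, S^n, Z^{j-1}) - H(Y_j | X_j, S_j) \right] + n \epsilon_n \\
&\overset{(f)}{=} \sum_{j=1}^{n} \left[ H(Y_j | S_j, U_j) - H(Y_j | X_j, S_j) \right] + n \epsilon_n \\
&= \sum_{j=1}^{n} I(X_j; Y_j | S_j, U_j) + n \epsilon_n,
\end{aligned}
\tag{36}
\]

where

(a) follows from the fact that $W_1$, $W_2$ and $S^n$ are mutually independent,

(b) follows from Fano's inequality and $\epsilon_n \to 0$ as $n \to \infty$,

(c) follows from $Y_j \leftrightarrow (W_2, S^n, Y^{j-1}) \leftrightarrow Z^{j-1}$ and $Y_j \leftrightarrow (W_1, W_2, S^n, Y^{j-1}) \leftrightarrow Z^{j-1}$,

(d) follows from $H(Y_j | W_2, S^n, Y^{j-1}, Z^{j-1}) \le H(Y_j | W_2, S^n, Z^{j-1})$ and the fact that $X^n$ is a deterministic function of $(W_1, W_2, S^n)$,

(e) follows from the memoryless property of the broadcast channel, and

(f) follows from $U_j := \{W_2, S^{j-1}, S_{j+1}^n, Z^{j-1}\}$.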
We can also bound the rate $R_2$ as follows:

\[
\begin{aligned}
n R_2 &\le H(W_2) \\
&\overset{(a)}{=} H(W_2, S^n) - H(S^n) \\
&\overset{(b)}{\le} I(W_2, S^n; Z^n) - H(S^n) + n \epsilon_n \\
&= \sum_{j=1}^{n} \left[ I(W_2, S^n; Z_j | Z^{j-1}) - H(S_j | S^{j-1}) \right] + n \epsilon_n \\
&\overset{(c)}{=} \sum_{j=1}^{n} \left[ H(Z_j | Z^{j-1}) - H(Z_j | W_2, S^n, Z^{j-1}) - H(S_j) \right] + n \epsilon_n \\
&\overset{(d)}{\le} \sum_{j=1}^{n} \left[ H(Z_j) - H(Z_j | U_j, S_j) - H(S_j) \right] + n \epsilon_n \\
&= \sum_{j=1}^{n} \left[ I(U_j, S_j; Z_j) - H(S_j) \right] + n \epsilon_n,
\end{aligned}
\]

where

(a) follows from the fact that $W_1$, $W_2$ and $S^n$ are mutually independent,

(b) follows from Fano's inequality and $\epsilon_n \to 0$ as $n \to \infty$,

(c) follows from the fact that $S^n$ is an i.i.d. random vector,

(d) follows from $H(Z_j | Z^{j-1}) \le H(Z_j)$ and $U_j := \{W_2, S^{j-1}, S_{j+1}^n, Z^{j-1}\}$.

We can then write (36) and the preceding bound on $R_2$ as

\[
R_1 \le I(X; Y | Q, S, U) + \epsilon_n, \tag{37a}
\]
\[
R_2 \le I(U, S; Z | Q) - H(S) + \epsilon_n, \tag{37b}
\]

where $Q$ takes values in the set $\{1, 2, \ldots, n\}$ with equal probability and the joint probability distribution on $(S, Q, U, X, Y, Z)$ is $p(S = s, Q = q, U = u, X = x)\, p(y | x, s)\, p(z | y)$, with

\[
p(S = s, Q = q, U = u, X = x) = p(s)\, p(q)\, p(U_q = u | s, q)\, p(X_q = x | s, q, u).
\]

Finally, we can write (37) as

\[
R_1 \le I(X; Y | U, S) + \epsilon_n,
\]
\[
R_2 \le I(U, S; Z) - H(S) + \epsilon_n,
\]

where $U := (Q, U)$, since $I(U, S; Z | Q) \le I(Q, U, S; Z)$.

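Given any $\delta > 0$, the associated distortion $D^{(n)}$, for sufficiently large $n$, satisfies

\[
\begin{aligned}
\Delta + \delta \ge D^{(n)} &= \mathbb{E}\, d(X^n, S^n) \\
&= \frac{1}{n} \sum_{j=1}^{n} \sum_{x, s} p(X_j = x, S_j = s)\, d(x, s) \\
&= \sum_{x, s} p(X = x, S = s)\, d(x, s) \\
&= \mathbb{E}\, d(X, S).
\end{aligned}
\]

As $n \to \infty$ and $\delta \to 0$, $(U, S, X, Y, Z) \in \mathcal{P}(\Delta)$ and $(R_1, R_2) \in \mathcal{R}_{C'}(\Delta)$.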